Introduction


This manual provides an overview of the SunOS programming environment, and
describes a variety of system facilities, utility commands, and libraries that are
of interest primarily to applications developers.

The first portion of the manual describes system interface and support facilities 
for use with the C programming language: 

□ Chapter 2: SunOS Programming

System interface and standard library support facilities 

□ Chapter 3: System V Compatibility Package

Comparison of BSD 4.x and System V, and SunOS conformance with the 
System V Interface Definition (SVID) 

□ Chapter 4: Shared Libraries 

Overview, system support and development techniques 

The next two chapters describe C programming aids to check for correctness and
monitor performance:

□ Chapter 5: lint — a Program Verifier for C 
Checking programs for internal consistency and portability 

□ Chapter 6: Performance Analysis

Timing, Profiling and Coverage Analysis tools 

The next two chapters describe system utilities for version control and consistent 
compilation: 

□ Chapter 7: SCCS — Source Code Control System

Version control for source files

□ Chapter 8: make User’s Guide

Consistent compilation for programs and software projects

The next three chapters describe program-generation tools:

□ Chapter 9: m4 — a Macro Processor

Parametric macro-language (pre)processor

□ Chapter 10: lex — a Lexical Analyzer Generator 
Preprocessor for scanning routines 

□ Chapter 11: yacc — Yet another Compiler Compiler 
Preprocessor for parsing routines 

The last two chapters describe the BSD and System V curses terminal-display 
library routines. 

□ Chapter 12: curses Library: Screen-Oriented Cursor Motions
BSD curses library routines

□ Chapter 13: System V curses and terminfo 
System V display library and terminal-capabilities database. 

Appendix A describes low-level commands for SCCS. 

Appendix B summarizes the enhancements made to the SunOS version of the 
make utility. 
the SunOS Reference Manual.

SunOS Programming


This chapter is an introduction to programming on the SunOS system. The 
emphasis is on how to write programs that make use of system calls and library 
functions. The topics discussed include 

□ handling command-line arguments

□ rudimentary I/O; the standard input and output

□ the standard I/O library; file system access

□ low-level I/O: open, read, write, close, seek

□ processes: exec, fork, pipes

□ signals: interrupts, etc.

Section 2.7 — The Standard I/O Library — describes the standard I/O library in 
detail. 

This chapter describes how to write programs that interface with the SunOS 
operating system in a nontrivial way. This includes programs that use files by 
name, that use pipes, that invoke other commands as they run, or that attempt to 
catch interrupts and other signals during execution. It summarizes material that 
is described in detail in the SunOS Reference Manual. 

There is no attempt to be complete; only generally useful material is dealt with. 
It is assumed that you will be programming in C, so you must be able to read the 
language roughly up to the level of The C Programming Language. You should 
also be familiar with SunOS itself. 


2.1. Basics 
Program Arguments 


When a C program is run as a command, the arguments on the command line are
made available to the function main() as an argument count argc and an array
argv of pointers to character strings that contain the arguments. By convention,
argv[0] is the command name itself, so argc is always greater than 0.


The following program illustrates the mechanism: it simply echoes its arguments
back to the terminal; this is essentially the echo command.



argv is a pointer to an array whose elements are pointers to arrays of characters;
each is terminated by \0, so they can be treated as strings. The program starts by
printing argv[1] and loops until it has printed argv[argc-1].

The argument count and the arguments are parameters to main(). If you want
to keep them around so other routines can get at them, you must copy them to
external variables.
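The argument handling described above can be sketched as follows. This is a minimal illustration in ANSI C rather than the older declaration style used in this chapter; the names save_args(), join_args(), saved_argc and saved_argv are hypothetical and chosen only for this example. It copies the arguments to external variables and builds the line that an echo-like program would print.

```c
#include <string.h>

/* External copies of the arguments, visible to routines other than main(). */
int    saved_argc;
char **saved_argv;

/* Remember the argument count and vector that were handed to main(). */
void save_args(int argc, char *argv[])
{
    saved_argc = argc;
    saved_argv = argv;
}

/*
 * Build the echoed line: argv[1] through argv[argc-1], separated by
 * single spaces and ended by a newline.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int join_args(char *out, size_t outsize)
{
    size_t used = 0;
    int i;

    if (outsize == 0)
        return -1;
    out[0] = '\0';
    for (i = 1; i < saved_argc; i++) {
        size_t len = strlen(saved_argv[i]);

        if (used + len + 2 > outsize)   /* word + separator + '\0' */
            return -1;
        memcpy(out + used, saved_argv[i], len);
        used += len;
        out[used++] = (i < saved_argc - 1) ? ' ' : '\n';
        out[used] = '\0';
    }
    return 0;
}
```

A main() using this sketch would call save_args(argc, argv) first, then join_args() on a local buffer, and write the result to the standard output.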

2.2. Standard Input and Standard Output

The simplest input mechanism is to read from the standard input, which is
generally the user’s terminal. The function getchar() returns the next input
character each time it is called. A file may be substituted for the terminal by
using the < convention (input redirection): if prog uses getchar(), the command
line

    tutorial% prog < filename

makes prog read from the file specified by filename instead of the terminal.
prog itself need know nothing about where its input is coming from. This is
also true if the input comes from another program via the pipe mechanism:

    tutorial% otherprog | prog

provides the standard input for prog from the standard output (see below) of
otherprog.


getchar() returns the value EOF when it encounters the end of file (or an
error) on whatever you are reading. The value of EOF is normally defined to be
-1, but it is unwise to take any advantage of that knowledge. As will become
clear shortly, this value is automatically defined for you when you compile a
program, and need not be of any concern.


Similarly, putchar(c) puts the character c on the ‘standard output’, which is
also by default the terminal. The output can be captured on a file by using >: if
prog uses putchar(),

    tutorial% prog > outputfile

writes the standard output on outputfile instead of the terminal. outputfile is
created if it doesn’t exist; if it already exists, its previous contents are
overwritten. A pipe can be used:

    tutorial% prog | otherprog

puts the standard output of prog into the standard input of otherprog.

The function printf(), which formats output in various ways, uses the same
mechanism as putchar() does, so calls to printf() and putchar() may
be intermixed in any order; the output will appear in the order of the calls.

Similarly, the function scanf() provides for formatted input conversion; it will
read the standard input and break it up into strings, numbers, etc., as desired.
scanf() uses the same mechanism as getchar(), so calls to them may also
be intermixed.

Many programs read only one input and write one output; for such programs I/O
with getchar(), putchar(), scanf(), and printf() may be entirely
adequate, and it is almost always enough to get started. This is particularly true
if the SunOS pipe facility is used to connect the output of one program to the
input of the next. For example, the following program strips out all ASCII
control characters from its input (except for newline and tab).

    #include <stdio.h>

    main()    /* ccstrip: strip non-graphic characters */
    {
        int c;

        while ((c = getchar()) != EOF)
            if ((c >= ' ' && c < 0177) || c == '\t' || c == '\n')
                putchar(c);
        exit(0);
    }

The line

    #include <stdio.h>

should appear at the beginning of each source file which does I/O using the
standard I/O functions described in section 3(S) of the SunOS Reference Manual.
The C compiler reads a file (/usr/include/stdio.h) of standard routines and
symbols that includes the definition of EOF.

If it is necessary to treat multiple files, you can use cat to collect the files for you:

    tutorial% cat file1 file2 ... | ccstrip > output

and thus avoid learning how to access files from a program. By the way, the call
to exit() at the end is not necessary to make the program work properly, but it
assures that any caller of the program will see a normal termination status
(conventionally 0) from the program when it completes. Section 2.5.3 discusses
returning status in more detail.
2.3. The Standard I/O Library


Accessing Files 


The ‘Standard I/O Library’ is a collection of routines intended to provide 
efficient and portable I/O services for most C programs. The standard I/O library 
is available on each system that supports C, so programs that confine their system 
interactions to its facilities can be transported from one system to another essen¬ 
tially without change. 

This section discusses the basics of the standard I/O library. Section 2.7, The
Standard I/O Library, contains a more complete description of its capabilities
and calling conventions.

The above programs have all read the standard input and written the standard
output, which we have assumed are magically predefined. The next step is to
write a program that accesses a file that is not already connected to the program.
One simple example is wc, which counts the lines, words and characters in a set
of files. For instance, the command

    tutorial% wc x.c y.c

displays the number of lines, words and characters in x.c and y.c and the
totals.


The question is how to arrange for the named files to be read — that is, how to 
connect the filenames to the I/O statements which actually read the data. 

The rules are simple: you have to open a file by the standard library function
fopen() before it can be read from or written to. fopen() takes an external
name (like x.c or y.c), does some housekeeping and negotiation with the
operating system, and returns an internal name which must be used in subsequent
reads or writes of the file.

This internal name is actually a pointer, called a file pointer, to a structure which
contains information about the file, such as the location of a buffer, the current
character position in the buffer, whether the file is being read or written, and the
like. Users don’t need to know the details, because part of the standard I/O
definitions obtained by including stdio.h is a structure definition called FILE.
The only declaration needed for a file pointer is exemplified by

    FILE *fp, *fopen();

This says that fp is a pointer to a FILE, and fopen() returns a pointer to a
FILE. FILE is a type name, like int, not a structure tag.

The actual call to fopen() in a program has the form:

    fp = fopen(name, mode);

The first argument of fopen() is the name of the file, as a character string. The
second argument is the mode, also as a character string, which indicates how you
intend to use the file. The allowable modes are read ("r"), write ("w"), or
append ("a").

Revision A of 9 May 1988

In addition, each mode may be followed by a + sign to open the file for reading
and writing. r+ positions the stream at the beginning of the file, w+ creates or
truncates the file, and a+ positions the stream at the end of the file. Both
reads and writes may be used on read/write streams, with the limitation that an
fseek(), rewind(), or reading end-of-file must be used between a read and a
write or vice versa.
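The repositioning rule for read/write streams can be illustrated with a short sketch. The function name write_then_read() is hypothetical; it assumes fp was opened in an update mode such as "w+", and it uses getc(), which returns the next character of a stream or EOF.

```c
#include <stdio.h>

/*
 * Write text to a stream opened for update, then read it back through
 * the same stream. Returns the number of characters read back,
 * or -1 on error.
 */
long write_then_read(FILE *fp, const char *text, char *out, size_t outsize)
{
    size_t n = 0;
    int c;

    if (outsize == 0)
        return -1;
    if (fputs(text, fp) == EOF)
        return -1;
    /* Switching from writing to reading: reposition the stream first. */
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    while (n + 1 < outsize && (c = getc(fp)) != EOF)
        out[n++] = (char)c;
    out[n] = '\0';
    return ferror(fp) ? -1 : (long)n;
}
```

Without the fseek() call between the write and the read, the behavior of the stream is not defined, which is the limitation the paragraph above describes.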

If a file that you open for writing or appending does not exist, it is created (if
possible). Opening an existing file for writing discards the old contents. Trying to
read a file that does not exist is an error, and there may be other causes of error as
well (like trying to read a file when you don’t have permission). If there is any
error, fopen() returns the null pointer value NULL (defined as zero in
stdio.h).

The next thing needed is a way to read or write the file once it is open. There are
several possibilities, of which getc() and putc() are the simplest. getc()
returns the next character from a file; it needs the file pointer to tell it what file.
Thus

    c = getc(fp)

places in c the next character from the file referred to by fp; it returns EOF when
it reaches end of file. putc() is the inverse of getc():

    putc(c, fp)

puts the character c on the file fp and returns c as its value. getc() and
putc() return EOF on error.
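As a minimal sketch of how getc() and putc() combine, the following function copies one open stream to another a character at a time. The name copy_stream() is hypothetical and used only for illustration.

```c
#include <stdio.h>

/*
 * Copy in to out one character at a time.
 * Returns the number of characters copied, or -1 on a write error.
 */
long copy_stream(FILE *in, FILE *out)
{
    int c;              /* int, not char, so EOF stays distinct from data */
    long count = 0;

    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF)
            return -1;
        count++;
    }
    return count;
}
```

Declaring c as int matters: EOF must be distinguishable from every valid character value, and a char cannot guarantee that.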

When a program is started, three streams are opened automatically, and file
pointers are provided for them. These streams are the standard input, the
standard output, and the standard error output; the corresponding file pointers are
called stdin, stdout, and stderr. Normally these are all connected to the
terminal, but may be redirected to files or pipes as described in Section 2.2.
stdin, stdout and stderr are predefined in the I/O library as the standard
input, output and error files; they may be used anywhere an object of type
FILE * can be. They are constants, however, not variables, so don’t try to
assign to them.

With some of the preliminaries out of the way, we can now write wc. 

The basic design is one that has been found convenient for many programs; if 
there are command-line arguments, they are processed in order. If there are no 
arguments, the standard input is processed. This way the program can be used 
standalone or as part of a larger process. 



Programming Utilities and Libraries


    #include <stdio.h>

    main(argc, argv)    /* wc: count lines, words, chars */
    int argc;
    char *argv[];
    {
        int c, i, inword;
        FILE *fp, *fopen();
        long linect, wordct, charct;
        long tlinect = 0, twordct = 0, tcharct = 0;

        i = 1;
        fp = stdin;
        do {
            if (argc > 1 && (fp = fopen(argv[i], "r")) == NULL) {
                fprintf(stderr, "wc: can't open %s\n", argv[i]);
                continue;
            }
            linect = wordct = charct = inword = 0;
            while ((c = getc(fp)) != EOF) {
                charct++;
                if (c == '\n')
                    linect++;
                if (c == ' ' || c == '\t' || c == '\n')
                    inword = 0;
                else if (inword == 0) {
                    inword = 1;
                    wordct++;
                }
            }
            printf("%7ld %7ld %7ld", linect, wordct, charct);
            printf(argc > 1 ? " %s\n" : "\n", argv[i]);
            fclose(fp);
            tlinect += linect;
            twordct += wordct;
            tcharct += charct;
        } while (++i < argc);
        if (argc > 2)
            printf("%7ld %7ld %7ld total\n", tlinect, twordct, tcharct);
        exit(0);
    }

The function fprintf() is identical to printf(), save that the first
argument is a file pointer that specifies the file to be written.

The function fclose() is the inverse of fopen(); it breaks the connection
between the file pointer and the external name that was established by fopen(),
freeing the file pointer for another file. Since there is a limit on the number of
files that a program may have open simultaneously, it’s a good idea to free things
when they are no longer needed. There is another reason to call fclose() on
an output file: it flushes the buffer in which putc() is collecting output.

fclose() is called automatically for each open file when a program terminates
normally.


Error Handling: Stderr and Exit

stderr is assigned to a program in the same way that stdin and stdout are.
Output written on stderr appears on the user’s terminal even if the standard
output is redirected, unless the standard error is also redirected. wc writes its
diagnostics on stderr instead of stdout so that if one of the files can’t be
accessed for some reason, the message finds its way to the user’s terminal instead
of disappearing down a pipeline or into an output file.

The argument of exit() is made available to whatever process called the
process that is exiting (see Section 2.5.3), so the success or failure of the program can
be tested by another program that uses this one as a subprocess. By convention,
a return value of 0 signals that all is well; nonzero values signal abnormal
situations.

exit() itself calls fclose() for each open output file, to flush out any
buffered output, then calls a routine named _exit(). The function _exit()
terminates the program immediately without any buffer flushing; it may be called
directly if desired.


Miscellaneous I/O Functions 



The standard I/O library provides several other I/O functions besides those illus¬ 
trated above. 

Normally, output with putc() and the like is buffered; use fflush(fp) to
force it out immediately.

fscanf() is identical to scanf(), except that its first argument is a file
pointer (as with fprintf()) that specifies the file from which the input comes;
it returns EOF at end of file.

The functions sscanf() and sprintf() are identical to fscanf() and
fprintf(), except that the first argument names a character string instead of a
file pointer. The conversion is done from the string for sscanf() and into it
for sprintf(), and no input or output is done.

fgets(buf, size, fp) copies the next line from fp, up to and including a
newline, into buf; at most size-1 characters are copied; it returns NULL at
end of file. fputs(buf, fp) writes the string in buf onto file fp.

The function ungetc(c, fp) ‘pushes back’ the character c onto the input
stream fp; a subsequent call to getc(), fscanf(), etc., will encounter c.
Only one character of pushback per file is permitted.
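A few of these functions are sketched below under stated assumptions. The helper names parse_point(), format_point() and peek_char() are hypothetical, and format_point() uses snprintf(), the bounded C99 relative of sprintf(), rather than sprintf() itself, so that the output buffer cannot overflow.

```c
#include <stdio.h>

/* Convert "x,y" from a string with sscanf(); 1 on success, 0 otherwise. */
int parse_point(const char *s, int *x, int *y)
{
    return sscanf(s, "%d,%d", x, y) == 2;
}

/* Format "x,y" into buf; 0 on success, -1 if it does not fit. */
int format_point(char *buf, size_t size, int x, int y)
{
    int n = snprintf(buf, size, "%d,%d", x, y);

    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Look at the next character of fp without consuming it. */
int peek_char(FILE *fp)
{
    int c = getc(fp);

    if (c != EOF)
        ungetc(c, fp);      /* push it back for the next reader */
    return c;
}
```

peek_char() relies on the single character of pushback that the library permits, so it should not be called twice in a row with an intervening ungetc() of the caller's own.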
2.4. Low-Level I/O Functions


File Descriptors


read() and write()


This section describes the bottom level of I/O on the SunOS system. The lowest 
level of I/O in SunOS provides no buffering or any other services; it is in fact a 
direct entry into the operating system. You are entirely on your own, but on the 
other hand, you have the most control over what happens. And since the calls 
and usage are quite simple, this isn’t as bad as it sounds. 

In the SunOS operating system, all input and output is done by reading or writing 
files, because all peripheral devices, even the user’s terminal, are files in the file 
system. This means that a single, homogeneous interface handles all communi¬ 
cation between a program and peripheral devices. 

In the most general case, before reading or writing a file, it is necessary to inform
the system of your intent to do so, a process called ‘opening’ the file. If you are
going to write on a file, it may also be necessary to create it. The system checks
your right to do so (does the file exist? do you have permission to access it?)
and, if all is well, returns a small non-negative integer called a file descriptor.
Whenever I/O is to be done on the file, the file descriptor is used instead of the
name to identify the file. This is roughly analogous to the use of READ(5,...) and
WRITE(6,...) in FORTRAN. All information about an open file is maintained
by the system; the user program refers to the file only by the file
descriptor.

The file pointers discussed in Section 2.3 are similar in spirit to file descriptors,
but file descriptors are more fundamental. A file pointer is a pointer to a
structure that contains, among other things, the file descriptor for the file in question.

Since input and output involving the user’s terminal are so common, special
arrangements exist to make this convenient. When the command interpreter (the
‘shell’) runs a program, it opens three files, with file descriptors 0, 1, and 2,
called standard input, standard output, and standard error output. All of these are
normally connected to the terminal, so if a program reads file descriptor 0 and
writes file descriptors 1 and 2, it can do terminal I/O without opening the files.

If I/O is redirected to and from files with < and >, as in

    tutorial% prog < infile > outfile

the shell changes the default assignments for file descriptors 0 and 1 from the ter¬ 
minal to the named files. Similar observations hold if the input or output is asso¬ 
ciated with a pipe. Normally file descriptor 2 remains attached to the terminal, 
so error messages can go there. In all cases, the file assignments are changed by 
the shell, not by the program. The program does not need to know where its 
input comes from nor where its output goes, so long as it uses file 0 for input and 
1 and 2 for output. 

All input and output is done by two functions called read() and write().
For both, the first argument is a file descriptor. The second argument is a buffer
in your program where the data is to come from or go to. The third argument is
the number of bytes to be transferred. The calls are below:

    n_read = read(fd, buf, n);

    n_written = write(fd, buf, n);

Each call returns a byte count which is the number of bytes actually transferred.
On reading, the number of bytes returned may be less than the number asked for,
because fewer than n bytes remained to be read. When the file is a terminal,
read() normally reads only up to the next newline, which is generally less than
what was requested. A return value of zero bytes implies end of file, and -1
indicates an error of some sort. For writing, the returned value is the number of
bytes actually written; it is generally an error if this isn’t equal to the number
supposed to be written.
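One defensive way to deal with these return values is sketched below. The helpers write_all() and read_full() are hypothetical names, not system routines; they simply loop until the requested count has been transferred, end of file is seen, or an error is reported. The sketch assumes a POSIX-style unistd.h and does not retry calls interrupted by signals.

```c
#include <unistd.h>

/*
 * Write all n bytes of buf to fd, continuing after short writes.
 * Returns 0 on success, -1 on error.
 */
int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);

        if (w <= 0)             /* error, or no progress at all */
            return -1;
        buf += w;
        n -= (size_t)w;
    }
    return 0;
}

/*
 * Read until n bytes have arrived or end of file is reached.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t read_full(int fd, char *buf, size_t n)
{
    size_t got = 0;

    while (got < n) {
        ssize_t r = read(fd, buf + got, n - got);

        if (r < 0)
            return -1;
        if (r == 0)             /* end of file */
            break;
        got += (size_t)r;
    }
    return (ssize_t)got;
}
```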

The number of bytes to be read or written is quite arbitrary. The two most com¬ 
mon values are 1, which means one character at a time (‘unbuffered’), and 1024, 
corresponding to a physical blocksize on many peripheral devices. This latter 
size will be most efficient, but even character-at-a-time I/O is not inordinately 
expensive. 

Putting these facts together, we can write a simple program to copy its input to 
its output. This program will copy anything to anything, since the input and out¬ 
put can be redirected to any file or device. 


    #define BUFSIZE 1024

    main()    /* copy input to output */
    {
        char buf[BUFSIZE];
        int n;

        while ((n = read(0, buf, BUFSIZE)) > 0)
            write(1, buf, n);
        exit(0);
    }


If the file size is not a multiple of BUFSIZE, some read() will return a smaller
number of bytes, and the next call to read() after that will return zero.

It is instructive to see how read() and write() can be used to construct
higher-level routines like getchar(), putchar(), etc. For example, here is
a version of getchar() which does unbuffered input:

    #define CMASK 0377    /* for making char's > 0 */

    getchar()    /* unbuffered single character input */
    {
        char c;

        return((read(0, &c, 1) > 0) ? c & CMASK : EOF);
    }

c must be declared char, because read() accepts a character pointer. The
character being returned must be masked with 0377 to ensure that it is positive;
otherwise sign extension may make it negative. The constant 0377 is
appropriate for the Sun but not necessarily for other machines.

The second version of getchar() does input in big chunks, and hands out the
characters one at a time:

    #define CMASK 0377    /* for making char's > 0 */
    #define BUFSIZE 1024

    getchar()    /* buffered version */
    {
        static char buf[BUFSIZE];
        static char *bufp = buf;
        static int n = 0;

        if (n == 0) {    /* buffer is empty */
            n = read(0, buf, BUFSIZE);
            bufp = buf;
        }
        return((--n >= 0) ? *bufp++ & CMASK : EOF);
    }


open(), creat(), close(), and unlink()

Other than the default standard input, output and error files, you must explicitly
open files in order to read or write them. There are two system entry points for
this, open() and creat().


open() is rather like the fopen() discussed in the previous section, except
that instead of returning a file pointer, it returns a file descriptor, which is just an
int.

    int fd;

    fd = open(name, rwmode);

As with fopen(), the name argument is a character string corresponding to the
external file name. The access mode argument is different, however: rwmode is
0 for read, 1 for write, and 2 for read and write access. open() returns -1 if
any error occurs; otherwise it returns a valid file descriptor.

It is an error to try to open () a file that does not exist. The entry point 
creat () is provided to create new files, or to rewrite old ones. 

fd = creat(name, pmode); 

returns a file descriptor if it could create the file called name, and -1 if not. If
the file already exists, creat() will truncate it to zero length; it is not an error
to creat() a file that already exists.


If the file is brand new, creat () creates it with the protection mode specified 
by the pmode argument. In the SunOS file system, there are nine bits of protec¬ 
tion information associated with a file, controlling read, write and execute per¬ 
mission for the owner of the file, for the owner’s group, and for all others. Thus 
a three-digit octal number is most convenient for specifying the permissions. For 
example, 0755 specifies read, write and execute permission for the owner, and 
read and execute permission for the group and everyone else. 

To illustrate, here is a simplified version of the SunOS utility cp, a program
which copies one file to another. The main simplification is that our version
copies only one file, and does not permit the second argument to be a directory:

    #define NULL 0
    #define BUFSIZE 1024
    #define PMODE 0644    /* RW for owner, R for group, others */

    main(argc, argv)    /* cp: copy f1 to f2 */
    int argc;
    char *argv[];
    {
        int f1, f2, n;
        char buf[BUFSIZE];

        if (argc != 3)
            error("Usage: cp from to", NULL);
        if ((f1 = open(argv[1], 0)) == -1)
            error("cp: can't open %s", argv[1]);
        if ((f2 = creat(argv[2], PMODE)) == -1)
            error("cp: can't create %s", argv[2]);

        while ((n = read(f1, buf, BUFSIZE)) > 0)
            if (write(f2, buf, n) != n)
                error("cp: write error", NULL);
        exit(0);
    }

    error(s1, s2)    /* print error message and die */
    char *s1, *s2;
    {
        printf(s1, s2);
        printf("\n");
        exit(1);
    }

As we said earlier, there is a limit (typically 20-32) on the number of files which 
a program may have open simultaneously. Accordingly, any program which 
intends to process many files must be prepared to reuse file descriptors. The rou¬ 
tine close () breaks the connection between a file descriptor and an open file, 
and frees the file descriptor for use with some other file. Termination of a pro¬ 
gram via exit () or return from the main program closes all open files. 


The function unlink (filename) removes the file filename from the file 
system. 
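The four calls of this section can be exercised together in a short sketch. The helper names store_file() and load_file() are hypothetical. The sketch assumes a POSIX-style fcntl.h, where O_RDONLY is the symbolic name for the read mode 0 used above, and note that the process's umask may clear some of the permission bits requested from creat().

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or truncate) name with protection mode pmode and store
 * n bytes of data in it. Returns 0 on success, -1 on error.
 */
int store_file(const char *name, const char *data, size_t n, mode_t pmode)
{
    int fd = creat(name, pmode);
    int ok;

    if (fd == -1)
        return -1;
    ok = (write(fd, data, n) == (ssize_t)n);
    if (close(fd) == -1)        /* free the descriptor for reuse */
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Read up to n bytes from the start of name into buf.
 * Returns the byte count, or -1 if the file cannot be opened or read.
 */
ssize_t load_file(const char *name, char *buf, size_t n)
{
    int fd = open(name, O_RDONLY);
    ssize_t got;

    if (fd == -1)
        return -1;
    got = read(fd, buf, n);
    close(fd);
    return got;
}
```

Calling store_file() twice on the same name shows the truncation performed by creat(); after unlink() removes the name, load_file() fails because open() cannot find the file.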
Random Access: seek() and lseek()


Error Processing


File I/O is normally sequential: each read() or write() takes place at a
position in the file right after the previous one. When necessary, however, a file
can be read or written in any arbitrary order. The system call lseek() provides
a way to move around in a file without actually reading or writing:

    lseek(fd, offset, origin);

forces the current position in the file whose descriptor is fd to move to position
offset, which is taken relative to the location specified by origin.
Subsequent reading or writing will begin at that position. offset is a long; fd and
origin are int’s. origin can be 0, 1, or 2 to specify that offset is to be
measured from the beginning, from the current position, or from the end of the
file, respectively. For example, to append to a file, seek to the end before
writing:

    lseek(fd, 0L, 2);

To get back to the beginning (‘rewind’),

    lseek(fd, 0L, 0);

Notice the 0L argument; it could also be written as (long) 0.

With lseek(), it is possible to treat files more or less like large arrays, at the
price of slower access. For example, the following simple function reads any
number of bytes from any arbitrary place in a file.

    get(fd, pos, buf, n)    /* read n bytes from position pos */
    int fd, n;
    long pos;
    char *buf;
    {
        lseek(fd, pos, 0);    /* get to pos */
        return(read(fd, buf, n));
    }
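Because lseek() returns the resulting position, it can also report how large a file is. The following sketch assumes a POSIX-style unistd.h, in which SEEK_SET, SEEK_CUR and SEEK_END are the symbolic names for the origin values 0, 1 and 2; the function names file_size() and size_of_file() are hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Size in bytes of the file open on fd, leaving the current
 * position unchanged. Returns -1 on error.
 */
off_t file_size(int fd)
{
    off_t here = lseek(fd, 0L, SEEK_CUR);   /* origin 1: where are we now? */
    off_t end;

    if (here == (off_t)-1)
        return -1;
    end = lseek(fd, 0L, SEEK_END);          /* origin 2: the end of the file */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, here, SEEK_SET) == (off_t)-1)   /* origin 0: go back */
        return -1;
    return end;
}

/* Convenience wrapper: size of the named file, or -1 on error. */
off_t size_of_file(const char *name)
{
    int fd = open(name, O_RDONLY);
    off_t size;

    if (fd == -1)
        return -1;
    size = file_size(fd);
    close(fd);
    return size;
}
```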


The routines discussed in this section, and in fact all the routines which are direct
entries into the system, can incur errors. Usually they indicate an error by
returning a value of -1. Sometimes it is nice to know what sort of error occurred; for
this purpose all these routines, when appropriate, leave an error number in the
external variable errno. The meanings of the various error numbers are listed
in intro(2) in the SunOS Reference Manual so your program can, for example,
determine if an attempt to open a file failed because it did not exist or because
the user lacked permission to read it. Perhaps more commonly, you may want to
display the reason for failure. The routine perror() displays a message
associated with the value of errno; more generally, sys_errlist is an array of
character strings which can be indexed by errno and displayed by your
program.
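A minimal sketch of this kind of error inspection follows. The function name why_cannot_open() is hypothetical. Instead of indexing sys_errlist directly, the sketch uses strerror(), the standard C library routine that returns the message string for an error number.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Try to open name for reading. Returns 0 on success, or the errno
 * value that describes why the open failed.
 */
int why_cannot_open(const char *name)
{
    int fd = open(name, O_RDONLY);
    int saved;

    if (fd == -1) {
        saved = errno;      /* save it: later calls may overwrite errno */
        fprintf(stderr, "%s: %s\n", name, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```

A caller can compare the result with the symbolic constants from errno.h, such as ENOENT for a missing file or EACCES for a permission failure.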
2.5. Processes

The system() Function


It is often easier to use a program written by someone else than to invent one’s
own. This section describes how to execute a program from within another.

The easiest way to execute a program from another is to use the standard library
routine system(). system() takes one argument, a command string exactly
as typed at the terminal (except for the newline at the end) and executes it. For
instance, to timestamp the output of a program:

    main()
    {
        system("date");
        /* rest of processing */
    }


If the command string has to be built from pieces, the in-memory formatting 
capabilities of sprintf () may be useful. 
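As a hedged illustration of building a command from pieces, the sketch below formats a test -f command and runs it with system(). The function name is_regular_file() is hypothetical, snprintf() is used in place of sprintf() to bound the buffer, and the WIFEXITED and WEXITSTATUS macros are assumed to be available from a POSIX-style sys/wait.h. Paths containing a single quote are rejected because they would break the shell quoting.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>

/*
 * Ask the shell whether path names a regular file.
 * Returns 1 if it does, 0 if it does not, -1 if the command
 * could not be built or run.
 */
int is_regular_file(const char *path)
{
    char cmd[512];
    int n, status;

    if (strchr(path, '\'') != NULL)
        return -1;                      /* cannot be quoted safely here */
    n = snprintf(cmd, sizeof cmd, "test -f '%s'", path);
    if (n < 0 || (size_t)n >= sizeof cmd)
        return -1;                      /* command would not fit */
    status = system(cmd);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Running a shell for such a small question is expensive; the sketch is meant only to show the string-building step.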

Remember that getc() and putc() normally buffer their input; terminal I/O
will not be properly synchronized unless this buffering is defeated. For output,
use fflush(); for input, see setbuf() in section 2.7.


If you’re not using the standard library, or if you need finer control over what
happens, you will have to construct calls to other programs using the more
primitive routines that the standard library’s system() routine is based on.*

The most basic operation is to execute another program without returning, by
using the routine execl(). To display the date as the last action of a running
program, use

    execl("/bin/date", "date", NULL);

The first argument to execl () is the filename of the command; you have to 
know where it is found in the file system. The second argument is convention¬ 
ally the program name (that is, the last component of the file name), but this is 
seldom used except as a placeholder. If the command takes arguments, they are 
strung out after this; the end of the list is marked by a NULL argument. 

The execl () call overlays the existing program with the new one, runs that, 
then exits. There is no return to the original program. 

More realistically, a program might fall into two or more phases that
communicate only through temporary files. Here it is natural to start the second pass
simply by an execl() call from the first.

The one exception to the rule that the original program never gets control back
occurs when there is an error, for example if the file can’t be found or is not
executable. If you don’t know where date is located, you might try the following
calls.


Low-Level Process Creation: execl() and execv()


* system() uses /bin/sh (the Bourne shell) to execute the command string, so syntax specific to the C
shell will not work.

Control of Processes: fork() and wait()


- \ 

execl ("/bin/date", "date*', NULL); 
execl("/usr/bin/date", "date", NULL); 
fprintf(stderr, "Someone stole 'date*\n"); 

< _ ^ 

A variant of execl() called execv() is useful when you don’t know in
advance how many arguments there are going to be. The call is

execv(filename, argp);

where argp is an array of pointers to the arguments; the last pointer in the array
must be NULL so execv() can tell where the list ends. As with execl(),
filename is the file in which the program is found, and argp[0] is the name
of the program. (This arrangement is identical to the argv array for program
arguments.)
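As a minimal sketch of building such an array when the argument count is only known at run time, the helper below (make_argv is a hypothetical name, not a library routine) copies the argument pointers into a freshly allocated vector and adds the terminating NULL. It is written in ANSI C rather than the older style used in this chapter.

```c
#include <stdlib.h>

/* Build a NULL-terminated argument vector suitable for execv().
 * `prog` becomes argp[0]; the `n` strings in `args` follow it.
 * Returns a malloc'd array (the caller frees it), or NULL on failure.
 * make_argv is a hypothetical helper, not part of any library. */
char **make_argv(const char *prog, int n, char *const args[])
{
    char **argp = malloc((size_t)(n + 2) * sizeof *argp);
    int i;

    if (argp == NULL)
        return NULL;
    argp[0] = (char *)prog;         /* conventionally the program name */
    for (i = 0; i < n; i++)
        argp[i + 1] = args[i];
    argp[n + 1] = NULL;             /* execv() stops at this pointer */
    return argp;
}
```

A caller would then pass the result straight to execv(), for example execv("/bin/echo", make_argv("echo", 2, words)), remembering that execv() returns only if it fails.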

Neither of these routines provides the niceties of normal command execution.
There is no automatic search of multiple directories; you have to know precisely
where the command is located. Nor do you get the expansion of metacharacters
like <, >, *, ?, and [ ] in the argument list. If you want these, use
execl() to invoke the shell sh, which then does all the work. Construct a
string commandline that contains the complete command as it would have
been typed at the terminal, then say

execl("/bin/sh", "sh", "-c", commandline, NULL);

The shell is assumed to be at a fixed place, /bin/sh. Its argument -c says to treat
the next argument as a whole command line, so it does just what you want. The
only problem is in constructing the right information in commandline.


So far what we’ve talked about isn’t really all that useful by itself. Now we will
show how to regain control after running a program with execl() or
execv(). Since these routines simply overlay the new program on the old one,
to save the old one requires that it first be split into two copies; one of these can
be overlaid, while the other waits for the new, overlaying program to finish. The
splitting is done by a routine called fork():

proc_id = fork();

splits the program into two copies, both of which continue to run. The only
difference between the two is the value of proc_id, the ‘process id.’ In one of
these processes (the ‘child’), proc_id is zero. In the other (the ‘parent’),
proc_id is nonzero; it is the process number of the child. Thus the basic way
to call, and return from, another program is


if (fork() == 0)
    execl("/bin/sh", "sh", "-c", cmd, NULL); /* in child */

And in fact, except for handling errors, this is sufficient. The fork() makes
two copies of the program. In the child, the value returned by fork() is zero,
so it calls execl(), which does the command and then dies. In the parent,
fork() returns nonzero so it skips the execl(). If there is any error,
fork() returns -1.

More often, the parent wants to wait for the child to terminate before continuing
itself. This can be done with the function wait():


int status;

if (fork() == 0)
    execl(...);
wait(&status);


This still doesn’t handle any abnormal conditions, such as a failure of the
execl() or fork(), or the possibility that there might be more than one child
running simultaneously. The wait() returns the process id of the terminated
child, if you want to check it against the value returned by fork(). Finally,
this fragment doesn’t deal with any funny behavior on the part of the child
(which is reported in status). Still, these three lines are the heart of the standard
library’s system() routine, which we’ll show in a moment.

The status returned by wait() encodes in its low-order eight bits the
system’s idea of the child’s termination status; it is 0 for normal termination and
nonzero to indicate various kinds of problems. The next higher eight bits are
taken from the argument of the call to exit() which caused a normal termination
of the child process. It is good coding practice for all programs to return
meaningful status.
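The two fields of the status word can be picked apart without hard-coding bit positions. The sketch below assumes a POSIX system, where <sys/wait.h> supplies the WIFEXITED, WEXITSTATUS, WIFSIGNALED and WTERMSIG macros and waitpid() waits for one particular child; neither is mentioned in the text above, and run_and_report is a hypothetical name.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `cmd` through /bin/sh and report how it ended.
 * On return, *exit_code holds the child's exit() argument (or -1 if it
 * did not exit normally) and *term_signal holds the number of the
 * signal that killed it (or 0 if none did).
 * Returns 0 on success, -1 if fork() or waitpid() failed. */
int run_and_report(const char *cmd, int *exit_code, int *term_signal)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);             /* execl() returned, so it failed */
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    *exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    *term_signal = WIFSIGNALED(status) ? WTERMSIG(status) : 0;
    return 0;
}
```

Using the macros keeps the code independent of the exact bit layout, which the system is free to define differently from the one described here.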

When a program is called by the shell, the three file descriptors 0, 1, and 2 are set
up to point at the right files (see Section 2.4.1), and all other possible file
descriptors are available for use. When this program calls another one, correct
etiquette suggests making sure the same conditions hold. Neither fork() nor the
exec() calls affect open files in any way. If the parent is buffering output that
must come out before output from the child, the parent must flush its buffers
before the execl(). Conversely, if a caller buffers an input stream, the called
program will lose any information that has been read by the caller.

A pipe is an I/O channel intended for use between two cooperating processes:
one process writes into the pipe, while the other process reads from the pipe. The
system looks after buffering the data and synchronizing the two processes. Most
pipes are created by the shell, as in

tutorial% ls | pr

which connects the standard output of ls to the standard input of pr. Sometimes,
however, it is most convenient for a process to set up its own plumbing; in
this section, we illustrate how the pipe connection is established and used.

The system call pipe() creates a pipe. Since a pipe is used for both reading
and writing, two file descriptors are returned; the actual usage is like this:


int fd[2];

stat = pipe(fd);
if (stat == -1)
    /* there was an error ... */



fd is an array of two file descriptors, where fd[0] is the read side of the pipe
and fd[1] is for writing. These may be used in read(), write() and
close() calls just like any other file descriptors.

If a process reads a pipe which is empty, it waits until data arrives; if a process
writes into a pipe which is too full, it waits until the pipe empties somewhat. If
the write side of the pipe is closed, a subsequent read() will encounter end of
file.
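The behavior just described can be observed inside a single process. The following sketch (pipe_roundtrip is a hypothetical name) writes a short message into a pipe, closes the write side, reads the message back, and then checks that one more read() reports end of file. It assumes the message is small enough to fit in the pipe's buffer; a longer one would make the write() wait for a reader that never comes.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Send `msg` through a pipe within one process and read it back.
 * Returns the number of bytes read into `buf`, or -1 on error.
 * *at_eof is set to 1 if a further read() returns 0 (end of file),
 * which happens once the write side is closed and the data is drained. */
int pipe_roundtrip(const char *msg, char *buf, size_t bufsize, int *at_eof)
{
    int fd[2];
    ssize_t n, extra;
    char c;

    if (pipe(fd) == -1)
        return -1;
    if (write(fd[1], msg, strlen(msg)) == -1) {   /* fd[1]: write side */
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);                      /* no writers remain */
    n = read(fd[0], buf, bufsize);     /* fd[0]: read side */
    extra = read(fd[0], &c, 1);        /* 0 here means end of file */
    close(fd[0]);
    *at_eof = (extra == 0);
    return (int)n;
}
```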

To illustrate the use of pipes in a realistic setting, let us write a function called 
popen (cmd, mode), which creates a process cmd (just as system () does), 
and returns a file descriptor that will either read or write that process, according 
to mode. That is, the call 

fout = popen("pr", WRITE); 

creates a process that executes the pr command; subsequent write() calls
using the file descriptor fout will send their data to that process through the
pipe.

popen() first creates the pipe with a pipe() system call; it then fork()’s to
create two copies of itself. The child decides whether it is supposed to read or
write, closes the other side of the pipe, then calls the shell (via execl()) to run
the desired process. The parent likewise closes the end of the pipe it does not
use. These closes are necessary to make end-of-file tests work properly. For
example, if a child that intends to read fails to close the write end of the pipe, it
will never see the end of the pipe file, just because there is one writer potentially
active.

#include <stdio.h>

#define READ 0
#define WRITE 1

#define tst(a, b) (mode == READ ? (b) : (a))
static int popen_pid;

popen(cmd, mode)
char *cmd;
int mode;
{
    int p[2];

    if (pipe(p) < 0)
        return(NULL);
    if ((popen_pid = fork()) == 0) {
        close(tst(p[WRITE], p[READ]));
        close(tst(0, 1));
        dup(tst(p[READ], p[WRITE]));
        close(tst(p[READ], p[WRITE]));
        execl("/bin/sh", "sh", "-c", cmd, 0);
        _exit(1); /* disaster has occurred if we get here */
    }
    if (popen_pid == -1)
        return(NULL);
    close(tst(p[READ], p[WRITE]));
    return(tst(p[WRITE], p[READ]));
}

The sequence of close()’s in the child is a bit tricky. Suppose that the task is
to create a child process that will read data from the parent. Then the first
close() closes the write side of the pipe, leaving the read side open. The lines

close(tst(0, 1));
dup(tst(p[READ], p[WRITE]));

are the conventional way to associate the pipe descriptor with the standard input
of the child. The close() closes file descriptor 0, that is, the standard input.
dup() is a system call that returns a duplicate of an already open file descriptor.
File descriptors are assigned in increasing order and the first available one is
returned, so the effect of the dup() is to copy the file descriptor for the pipe
(read side) to file descriptor 0; thus the read side of the pipe becomes the standard
input.* Finally, the old read side of the pipe is closed.

A similar sequence of operations takes place when the child process is supposed
to write to the parent instead of reading. You may find it a useful exercise to step
through that case.

* Yes, this is a bit tricky, but it's a standard idiom.

The job is not quite done, for we still need a function pclose() to close the
pipe created by popen(). The main reason for using a separate function rather
than close() is that it is desirable to wait for the termination of the child
process. First, the return value from pclose() indicates whether the process
succeeded. Equally important when a process creates several children is that only a
bounded number of unwaited-for children can exist, even if some of them have
terminated; performing the wait() lays the child to rest. Thus:

#include <signal.h>

pclose(fd) /* close pipe fd */
int fd;
{
    register r, (*hstat)(), (*istat)(), (*qstat)();
    int status;
    extern int popen_pid;

    close(fd);
    istat = signal(SIGINT, SIG_IGN);
    qstat = signal(SIGQUIT, SIG_IGN);
    hstat = signal(SIGHUP, SIG_IGN);
    while ((r = wait(&status)) != popen_pid && r != -1)
        ;
    if (r == -1)
        status = -1;
    signal(SIGINT, istat);
    signal(SIGQUIT, qstat);
    signal(SIGHUP, hstat);
    return(status);
}


The calls to signal() make sure that no interrupts, etc. interfere with the waiting
process; this is the topic of the next section.

The routine as written has the limitation that only one pipe may be open at once,
because of the single shared variable popen_pid; it really should be an array
indexed by file descriptor. A popen() function, with slightly different arguments
and return value, is available as part of the standard I/O library discussed
below. As currently written, it shares the same limitation.
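One way to lift that limitation is sketched below: the child's process id is stored in a table indexed by the descriptor handed back to the caller. This is only an illustration in ANSI C, not the library's implementation. The names popen_fd, pclose_fd, child_of and the bound MAX_FD are assumptions made for the sketch, and it uses the POSIX calls dup2() and waitpid() in place of the close()/dup() idiom and the wait() loop shown above.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define PIPE_READ  0
#define PIPE_WRITE 1
#define MAX_FD     256              /* assumed upper bound for this sketch */

static pid_t child_of[MAX_FD];      /* child pid, indexed by descriptor */

/* Like the popen() above, but remembers one child per descriptor.
 * `mode` says what the parent will do with the returned descriptor. */
int popen_fd(const char *cmd, int mode)
{
    int p[2], mine, theirs, target;
    pid_t pid;

    if (pipe(p) < 0)
        return -1;
    mine   = (mode == PIPE_READ) ? p[0] : p[1];   /* parent's end */
    theirs = (mode == PIPE_READ) ? p[1] : p[0];   /* child's end */
    target = (mode == PIPE_READ) ? 1 : 0;         /* child's stdout or stdin */
    if (mine >= MAX_FD || (pid = fork()) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        close(mine);
        if (theirs != target) {
            dup2(theirs, target);   /* pipe end becomes stdin/stdout */
            close(theirs);
        }
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);
    }
    close(theirs);
    child_of[mine] = pid;
    return mine;
}

/* Close the descriptor and wait for its own child.
 * Returns the wait status, or -1 on error. */
int pclose_fd(int fd)
{
    int status;
    pid_t pid;

    if (fd < 0 || fd >= MAX_FD || (pid = child_of[fd]) == 0)
        return -1;
    child_of[fd] = 0;
    close(fd);
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return status;
}
```

A fuller version would also have each new child close the descriptors recorded for earlier pipes, so that a long-running child cannot hold another pipe open, and would ignore signals while waiting as pclose() does.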

2.6. Signals — Interrupts and All That

This section is concerned with how to deal gracefully with signals from the outside
world (like interrupts), and with program faults. Since there’s nothing very
useful that can be done from within C about program faults, which arise mainly
from illegal memory references or from execution of peculiar instructions, we’ll
discuss only the outside world signals: interrupt and quit, which are generated
from the keyboard,* hangup, caused by hanging up the phone on dialup lines,
and terminate, generated by the kill command. When one of these events occurs,
the signal is sent to all processes which were started from the corresponding
terminal; the signal terminates the process unless other arrangements have been
made. In the quit case, a core image file is written for debugging purposes.

* The current binding of characters and signals can be discovered by the stty all command. On Sun
systems, typing control-C usually generates the interrupt signal and control-\ generates the quit signal.

signal() is the routine which alters the default action. signal() has two
arguments: the first specifies the signal to be processed, and the second
specifies what to do with that signal. The first argument is just a numeric code,
but the second is either a function, or a somewhat strange code that requests that
the signal either be ignored or that it be given the default action. The include file
signal.h gives names for the various arguments, and should always be included
when signals are used. Thus


#include <signal.h>

signal(SIGINT, SIG_IGN);

means that interrupts are ignored, while

signal(SIGINT, SIG_DFL);

restores the default action of process termination. In all cases, signal()
returns the previous value of the signal. The second argument to signal()
may instead be the name of a function (which has to be declared explicitly if the
compiler hasn’t seen it already). In this case, the named routine will be called
when the signal occurs. Most commonly this facility is used so that the program
can clean up unfinished business before terminating, for example to delete a
temporary file:


#include <signal.h>

main()
{
    int onintr();

    if (signal(SIGINT, SIG_IGN) != SIG_IGN)
        signal(SIGINT, onintr);

    /* Process ... */

    exit(0);
}

onintr()
{
    unlink(tempfile);
    exit(1);
}


Why the test and the double call to signal()? Recall that signals like interrupt
are sent to all processes started from a particular terminal. Accordingly,
when a program is to be run non-interactively (started by &), the shell turns off
interrupts for it so it won’t be stopped by interrupts intended for foreground
processes. If this program began by announcing that all interrupts were to be
sent to the onintr() routine regardless, that would undo the shell’s effort to
protect it when run in the background.

The solution, shown above, is to test the state of interrupt handling, and to
continue to ignore interrupts if they are already being ignored. The code as written
depends on the fact that signal() returns the previous state of a particular
signal. If signals were already being ignored, the process should continue to ignore
them; otherwise, they should be caught.
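The test can be packaged as a small routine. The version below is a sketch in ANSI C, where handlers take the signal number as an int argument; catch_unless_ignored and cleanup_and_exit are hypothetical names, and the sample handler only exits where a real one would first remove its temporary files.

```c
#include <signal.h>
#include <unistd.h>

/* Example handler: a real one would remove temporary files first. */
void cleanup_and_exit(int sig)
{
    (void)sig;
    _exit(1);
}

/* Install `handler` for `sig` unless the signal is currently ignored.
 * Returns 1 if the handler was installed, 0 if the signal was left
 * ignored, and -1 if signal() itself failed. */
int catch_unless_ignored(int sig, void (*handler)(int))
{
    void (*previous)(int) = signal(sig, SIG_IGN);   /* probe the old state */

    if (previous == SIG_ERR)
        return -1;
    if (previous == SIG_IGN)
        return 0;           /* e.g. started in the background with & */
    return signal(sig, handler) == SIG_ERR ? -1 : 1;
}
```

A program would call catch_unless_ignored(SIGINT, cleanup_and_exit) once, early in main().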

A more sophisticated program may wish to intercept an interrupt and interpret it 
as a request to stop what it is doing and return to its own command processing 
loop. Think of a text editor: interrupting a long display should not terminate the 
edit session and lose the work already done. The outline of the code for this case 
is probably best written like this: 

#include <signal.h>
#include <setjmp.h>

jmp_buf sjbuf;

main()
{
    int (*istat)(), onintr();

    istat = signal(SIGINT, SIG_IGN); /* save original status */
    setjmp(sjbuf); /* save current stack position */
    if (istat != SIG_IGN)
        signal(SIGINT, onintr);

    /* main processing loop */
}

onintr()
{
    printf("\nInterrupt\n");
    longjmp(sjbuf); /* return to saved state */
}


The include file setjmp.h declares the type jmp_buf, an object in which the
state can be saved. sjbuf is such an object. The setjmp() routine then saves
the state of things. When an interrupt occurs the onintr() routine is called,
which can display a message, set flags, or whatever. longjmp() takes as argument
an object set by setjmp(), and restores control to the location following
the call to setjmp(), so control (and the stack level) will pop back to the place
in the main routine where the signal is set up and the main loop entered. Notice,
by the way, that the signal gets set again after an interrupt occurs. This is
necessary; most signals are automatically reset to their default action when they occur.

Some programs that want to detect signals simply can’t be stopped at an arbitrary
point, for example in the middle of updating a linked list. If the routine called
when a signal occurs sets a flag and then returns instead of calling exit() or
longjmp(), execution continues at the exact point it was interrupted. The
interrupt flag can then be tested later.
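A flag-setting handler might look like the following sketch. It is written in ANSI C and assumes the flag is declared volatile sig_atomic_t, the type the C standard provides for objects shared with a signal handler; watch_interrupts and interrupt_pending are hypothetical names.

```c
#include <signal.h>

static volatile sig_atomic_t intflag = 0;   /* set by the handler */

/* Handler: note that an interrupt happened, then return. */
static void onintr(int sig)
{
    signal(sig, onintr);    /* re-arm, in case the action was reset */
    intflag = 1;
}

/* Start catching interrupts with the flag-setting handler. */
void watch_interrupts(void)
{
    intflag = 0;
    signal(SIGINT, onintr);
}

/* Report whether an interrupt arrived since the last call, and clear
 * the flag. An interrupt arriving between the read and the clear would
 * be lost; that is acceptable for a sketch. */
int interrupt_pending(void)
{
    int seen = intflag;

    intflag = 0;
    return seen;
}
```

The main loop can then call interrupt_pending() at points where its data structures are consistent, such as after a list update has completed.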

There is one difficulty associated with this approach. Suppose the program is
reading the terminal when the interrupt is sent. The specified routine is duly
called; it sets its flag and returns. If it were really true, as we said above, that
‘execution resumes at the exact point it was interrupted,’ the program would
continue reading the terminal until the user typed another line. This behavior
might well be confusing, since the user might not know that the program is
reading; he presumably would prefer to have the signal take effect instantly. The
method chosen to resolve this difficulty is to terminate the terminal read when
execution resumes after the signal, returning an error code which indicates what
happened.

Thus programs which catch and resume execution after signals should be
prepared for ‘errors’ which are caused by interrupted system calls. The ones to
watch out for are reads from a terminal, wait(), and pause(). A program
whose onintr() routine just sets intflag, resets the interrupt signal, and
returns, should usually include code like the following when it reads the standard
input:


if (getchar() == EOF)
    if (intflag)
        /* EOF caused by interrupt */
    else
        /* true end-of-file */



A final subtlety to keep in mind becomes important when catching signals is
combined with executing other programs. Suppose a program catches interrupts,
and also includes a method (like ‘!’ in the editor) whereby other programs can be
executed. Then the code should look something like this:

if (fork() == 0)
    execl(...);
signal(SIGINT, SIG_IGN); /* ignore interrupts */
wait(&status); /* until the child is done */
signal(SIGINT, onintr); /* restore interrupts */


Why is this? Again, it’s not obvious, but not really difficult. Suppose the program
you call catches its own interrupts. If you interrupt the subprogram, it will
get the signal and return to its main loop, and probably read your terminal. But
the calling program will also pop out of its wait for the subprogram and read your
terminal. Having two processes reading your terminal is very unfortunate, since
the system figuratively flips a coin to decide who should get each line of input.
A simple way out is to have the parent program ignore interrupts until the child is
done. This reasoning is reflected in the standard I/O library function
system(), shown here:

#include <signal.h>

system(s) /* run command string s */
char *s;
{
    int status, pid, w;
    register int (*istat)(), (*qstat)();

    if ((pid = fork()) == 0) {
        execl("/bin/sh", "sh", "-c", s, 0);
        _exit(127);
    }
    istat = signal(SIGINT, SIG_IGN);
    qstat = signal(SIGQUIT, SIG_IGN);
    while ((w = wait(&status)) != pid && w != -1)
        ;
    if (w == -1)
        status = -1;
    signal(SIGINT, istat);
    signal(SIGQUIT, qstat);
    return(status);
}


As an aside on declarations, the function signal() obviously has a rather
strange second argument. It is in fact a pointer to a function delivering an integer,
and this is also the type of the signal routine itself. The two values
SIG_IGN and SIG_DFL have the right type, but are chosen so they coincide
with no possible actual functions. For the enthusiast, here is how they are
defined for the Sun system; the definitions should be sufficiently ugly and
nonportable to encourage use of the include file.

#define SIG_DFL (int (*)())0
#define SIG_IGN (int (*)())1


2.7. The Standard I/O Library


The standard I/O library was designed with the following goals in mind: 

1. It must be as efficient as possible, both in time and in space, so that there 
will be no hesitation in using it, no matter how critical the application. 

2. It must be simple to use, and also free of the magic numbers and mysterious 
calls whose use mars the understandability and portability of many programs 
using older packages. 

3. The interface provided should be applicable on all machines, whether or not
the programs which implement it are directly portable to other systems, or to
non-Sun machines running a version of SunOS.

General Usage

Each program using the library must have the line

#include <stdio.h>

which defines certain macros and variables. The routines are in the normal C
library, so no special library argument is needed for loading. All names in the
include file intended only for internal use begin with an underscore to reduce
the possibility of collision with a user name. The names intended to be visible
outside the package are

stdin     the name of the standard input stream
stdout    the name of the standard output stream
stderr    the name of the standard error stream
EOF       is actually -1, and is the value returned by the read routines on
          end-of-file or error
NULL      is a notation for the null pointer, returned by pointer-valued
          functions to indicate an error
FILE      expands to struct _iob and is a useful shorthand when declaring
          pointers to streams
BUFSIZ    is a number (viz. 1024) of the size suitable for an I/O buffer supplied
          by the user. See setbuf(), below

getc(), getchar(), putc(), putchar(), feof(), ferror(), fileno()
          are defined as macros. Their actions are described below; they are
          mentioned here to point out that it is not possible to redeclare them
          and that they are not actually functions; thus, for example, they may
          not have breakpoints set on them.

The routines in this package offer the convenience of automatic buffer allocation
and output flushing where appropriate. The names stdin, stdout, and
stderr are constants and may not be assigned to.

Standard I/O Library Calls

FILE *fopen(filename, type) 
char *filename; 
char *type; 

opens the file and, if needed, allocates a buffer for it. filename is a character
string specifying the name. type is a character string (not a single character). It
may be "r", "w", or "a" to indicate intent to read, write, or append. In addition,
each mode may be followed by a + sign to open the file for reading and
writing. r+ positions the stream at the beginning of the file, w+ creates or
truncates the file, and a+ positions the stream to the end of the file. Both reads and
writes may be used on read/write streams, with the limitation that an fseek(),
rewind(), or reading end-of-file must be used between a read and a write or
vice versa. The value returned is a file pointer. If it is NULL the attempt to open
failed.
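The rule about repositioning between a read and a write on an update stream is easy to overlook. The sketch below (replace_first_byte is a hypothetical name) opens a file with "r+", reads its first byte, seeks back to the start, and overwrites that byte. It uses the ANSI C name SEEK_SET for the value 0 that this manual passes to fseek().

```c
#include <stdio.h>

/* Overwrite the first byte of the file at `path` with `c`.
 * Returns the byte that was there before, or EOF on any failure. */
int replace_first_byte(const char *path, int c)
{
    FILE *fp = fopen(path, "r+");   /* read and write; file must exist */
    int old;

    if (fp == NULL)
        return EOF;
    old = getc(fp);                 /* a read ... */
    if (old == EOF || fseek(fp, 0L, SEEK_SET) != 0) {  /* ... then a seek ... */
        fclose(fp);
        return EOF;
    }
    if (putc(c, fp) == EOF) {       /* ... before the write */
        fclose(fp);
        return EOF;
    }
    if (fclose(fp) == EOF)          /* flushes the buffered byte */
        return EOF;
    return old;
}
```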


FILE *freopen(filename, type, ioptr) 
char *filename; 
char *type; 

FILE *ioptr; 

The stream named by ioptr is closed, if necessary, and then reopened as if by
fopen(). If the attempt to open fails, NULL is returned, otherwise ioptr is
returned, which now refers to the new file. Often the reopened stream is stdin
or stdout. The filename and type parameters are as for fopen().

int getc(ioptr)
FILE *ioptr;

returns the next character from the stream named by ioptr, which is a pointer
to a file such as returned by fopen(), or the name stdin. The integer EOF is
returned on end-of-file or when an error occurs. The null character \0 is a legal
character.

int fgetc(ioptr) 

FILE *ioptr; 

acts like getc () but is a genuine function, not a macro, so it can be pointed to, 
passed as an argument, etc. 

int putc(c, ioptr) 
int c; 

FILE *ioptr; 

putc() writes the character c on the output stream named by ioptr, which is
a value returned from fopen() or perhaps stdout or stderr. The character
is returned as value, and EOF is returned on error.

int fputc(c, ioptr) 
int c; 

FILE *ioptr; 

acts like putc () but is a genuine function, not a macro. 

int fclose(ioptr) 

FILE *ioptr; 

The file corresponding to ioptr is closed after any buffers are emptied. A
buffer allocated by the I/O system is freed. fclose() is automatic on normal
termination of the program.

int fflush(ioptr) 

FILE *ioptr; 

Any buffered information on the (output) stream named by ioptr is written out.
Output files are normally buffered if they are not directed to the terminal.

(void) exit(errcode); 
int errcode; 

terminates the process and returns its argument as status to the parent. This is a
special version of the routine which calls fflush() for each output file. To
terminate without flushing, use _exit().

int feof(ioptr) 

FILE *ioptr; 

returns nonzero when end-of-file has occurred on the specified input stream. 

int ferror(ioptr) 

FILE *ioptr; 

returns nonzero when an error has occurred while reading or writing the named 
stream. The error indication lasts until the file has been closed. 

int getchar(); 
is identical to getc (stdin). 

int putchar(c);
is identical to putc(c, stdout).

char *fgets(s, n, ioptr)
char *s;
int n;
FILE *ioptr;

reads up to n-1 characters, or up to a newline character, whichever comes first,
from the stream ioptr into the string pointed to by the character pointer s. A
null character is placed after the last character read in the string s. fgets()
returns the first argument, or NULL if error or end-of-file occurred.

int puts(s) 
char *s; 

puts() copies the null-terminated string specified by s onto the standard output
stream and appends a newline character.

int fputs(s, ioptr) 
char *s; 

FILE *ioptr; 

writes the null-terminated string (character array) s on the stream ioptr. No
newline is appended. The last character transmitted is returned as value, or EOF
is returned on error.


int ungetc(c, ioptr) 
int c ; 

FILE *ioptr; 

The argument character c is pushed back on the input stream named by ioptr. 
Only one character may be pushed back. 

int printf(format, a1, ...)
char *format;

int fprintf(ioptr, format, a1, ...)
FILE *ioptr;
char *format;

int sprintf(s, format, a1, ...)
char *s;
char *format;

printf() writes on the standard output. fprintf() writes on the output
stream named by ioptr. sprintf() puts characters in the character array
(string) named by s. The specifications are as described in printf(3S).

printf() and fprintf() return the number of characters actually transmitted,
or return EOF if any error condition exists on the output file. sprintf()
returns a pointer to the buffer where the formatted string is placed.

int scanf(format, a1, ...)
char *format;

int fscanf(ioptr, format, a1, ...)
FILE *ioptr;
char *format;

int sscanf(s, format, a1, ...)
char *s;
char *format;

scanf() reads from the standard input. fscanf() reads from the named
input stream. sscanf() reads from the character string supplied as s.
scanf() reads characters, interprets them according to the format, and stores
the results in its arguments. Each routine expects as arguments a control string
format, and a set of arguments, each of which must be a pointer, indicating
where the converted input should be stored.

scanf() returns as its value the number of successfully matched and assigned
input items. This can be used to decide how many input items were found. On
end of file, EOF is returned; note that this is different from 0, which means that
the next input character does not match what was called for in the control string.
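The distinction between a count, 0, and EOF is easiest to see with sscanf(), which treats the end of the string as end of file. The wrapper below is a small sketch; parse_pair is a hypothetical name.

```c
#include <stdio.h>

/* Try to read two decimal integers from the string `s`.
 * Returns what sscanf() returns: 2 if both were found, 1 if only the
 * first matched, 0 if the input does not start with a number, and EOF
 * if the string ran out before anything could be converted. */
int parse_pair(const char *s, int *x, int *y)
{
    return sscanf(s, "%d %d", x, y);
}
```

With this helper, "3 4" yields 2, "7 z" yields 1, "z" yields 0, and an empty string yields EOF, so a caller that only tests for a nonzero result would mistake end of input for success.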


int fread(ptr, sizeof(*ptr), nitems, ioptr)
unsigned nitems;
FILE *ioptr;

reads nitems of data of the type of *ptr from file ioptr into the memory
area starting at ptr. No advance notification that binary I/O is being done is
required. fread() returns the number of items actually read from the specified
stream.

int fwrite(ptr, sizeof(*ptr), nitems, ioptr) 
unsigned nitems; 

FILE *ioptr; 

Like fread(), but in the other direction. fwrite() returns the number of items
actually transmitted to the specified stream. This may possibly be less than the
number of items requested if an error occurs while the transfer is in process.
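A short round trip shows both calls and their item-count return values. This sketch assumes the ANSI C routine tmpfile(), which is not described in this section, to obtain a scratch stream open for both writing and reading; roundtrip_ints is a hypothetical name.

```c
#include <stdio.h>

/* Write `n` ints to a temporary file, rewind, and read them back into
 * `out`. Returns the number of items read, or -1 if the file could not
 * be created or fully written. */
long roundtrip_ints(const int *in, int *out, size_t n)
{
    FILE *fp = tmpfile();           /* scratch stream, read and write */
    size_t got;

    if (fp == NULL)
        return -1;
    if (fwrite(in, sizeof *in, n, fp) != n) {   /* short count: error */
        fclose(fp);
        return -1;
    }
    rewind(fp);                     /* reposition between write and read */
    got = fread(out, sizeof *out, n, fp);
    fclose(fp);
    return (long)got;
}
```

Because the data is written as raw memory, a file produced this way is only meaningful to a program with the same integer size and byte order.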

(void) rewind(ioptr) 

FILE *ioptr; 

rewinds the stream named by ioptr. It is not very useful except on input, since 
a rewound output file is still open only for output. 

int system(string)
char *string;

The string is executed by the shell as if typed at the terminal. The return 
value is the exit code of the invoked shell, which is usually the exit code of the 
last command executed by it. 

int getw(ioptr) 

FILE *ioptr; 

returns the next word from the input stream named by ioptr. EOF is returned
on end-of-file or error, but since this is a perfectly good integer, feof() and
ferror() should be used. A ‘word’ is 32 bits on the Sun Workstation.

int putw(w, ioptr) 

FILE *ioptr; 

writes the integer w on the named output stream. putw() returns the current
error status of the specified stream, as if an ferror() call had been made.

(void) setbuf(ioptr, buf) 

FILE *ioptr; char *buf; 

setbuf() may be used after a stream has been opened but before I/O has
started. If buf is NULL, the stream is unbuffered. Otherwise the buffer supplied
is used. It must be a character array of sufficient size:

char buf[BUFSIZ];


(void) setbuffer(ioptr, buf, size)
FILE *ioptr;
char *buf;
int size;

setbuffer() is like setbuf() (described above), but can be used when a
specified, nonstandard buffer size should be used.

int fileno(ioptr) 

FILE *ioptr; 

returns the integer file descriptor associated with the file. 

int fseek(ioptr, offset, ptrname) 

FILE *ioptr; 
long offset; 
int ptrname; 

The location of the next byte in the stream named by ioptr is adjusted.
offset is a long integer. If ptrname is 0, the offset is measured from the
beginning of the file; if ptrname is 1, the offset is measured from the current
read or write pointer; if ptrname is 2, the offset is measured from the end of the
file. The routine accounts properly for any buffering. When this routine is used
on non-SunOS systems, the offset must be a value returned from ftell() and
the ptrname must be 0.

long ftell(ioptr) 

FILE *ioptr; 

The byte offset, measured from the beginning of the file, associated with the
named stream is returned. Any buffering is properly accounted for. On non-SunOS
systems the value of this call is useful only for handing to fseek(), so
as to position the file to the same place it was when ftell() was called.
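The two routines are often used together to remember a position, move elsewhere, and come back. The sketch below (stream_size is a hypothetical name) finds the length of a file that way, using the numeric ptrname values given above; ANSI C also names them SEEK_SET, SEEK_CUR and SEEK_END. It relies on byte offsets being meaningful, as they are on SunOS.

```c
#include <stdio.h>

/* Return the size in bytes of the file behind `fp`, leaving the stream
 * positioned where it was. Returns -1L on failure. */
long stream_size(FILE *fp)
{
    long here = ftell(fp);          /* remember the current offset */
    long size;

    if (here == -1L || fseek(fp, 0L, 2) != 0)   /* 2: measure from the end */
        return -1L;
    size = ftell(fp);               /* offset of the end is the length */
    if (fseek(fp, here, 0) != 0)    /* 0: measure from the beginning */
        return -1L;
    return size;
}
```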

int getpw(uid, buf) 
int uid; 
char *buf; 

The password file is searched for the given integer user ID. If an appropriate line 
is found, it is copied into the character array buf, and 0 is returned. If no line is 
found corresponding to the user ID then 1 is returned. 

char *malloc(num) 
int num; 

allocates num bytes. The pointer returned is aligned so as to be usable for any 
purpose. NULL is returned if no space is available. 

int free(ptr)
char *ptr;

free() frees up memory previously allocated by malloc(). free() returns
a 0 if any errors were detected (such as ptr being misaligned), and returns 1
otherwise. Disorder can be expected if the pointer was not obtained from
malloc().

char *calloc(num, size);
unsigned num;
unsigned size;

allocates space for num items, each of size size. The space is guaranteed to be
set to 0 and the pointer is aligned so as to be usable for any purpose. NULL is
returned if no space is available.

(void) cfree(ptr, num, size)
char *ptr;
unsigned num;
unsigned size;

Space is returned to the pool used by calloc(). Disorder can be expected if
the pointer was not obtained from calloc().

Character Type Checking

The following are macros whose definitions may be obtained by including
<ctype.h>.

isalpha(c) returns nonzero if c is alphabetic.
isupper(c) returns nonzero if c is upper-case alphabetic.
islower(c) returns nonzero if c is lower-case alphabetic.
isdigit(c) returns nonzero if c is a digit.

isxdigit(c) returns nonzero if c is a hexadecimal digit, that is, one of ‘0’
through ‘9’, ‘a’ through ‘f’, or ‘A’ through ‘F’.

isspace(c) returns nonzero if c is a spacing character: tab, newline, carriage
return, vertical tab, form feed, space.

ispunct(c) returns nonzero if c is any punctuation character, that is, not a
space, letter, digit or control character.

isalnum(c) returns nonzero if c is a letter or a digit.

isprint(c) returns nonzero if c is printable: a letter, digit, space, or
punctuation character.

iscntrl(c) returns nonzero if c is a control character.

isascii(c) returns nonzero if c is an ASCII character, that is, less than octal
0200.

isgraph(c) returns nonzero if c is a printing character; it is like isprint(c)
but doesn’t include the space character.

Character Type Conversion

toupper(c) returns the upper-case character corresponding to the lower-case
letter c.

tolower(c) returns the lower-case character corresponding to the upper-case
letter c.
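
The classification and conversion macros are typically used together, as in the following sketch. It assumes a modern <ctype.h>; the function name upcase_count_digits is hypothetical. Because toupper() is described above only for lower-case letters, the sketch tests with islower() before converting.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: upper-case every lower-case letter of s in place
 * and return the number of characters that were digits. */
size_t upcase_count_digits(char *s)
{
    size_t digits = 0;

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;   /* avoid negative values */

        if (isdigit(c))
            digits++;
        else if (islower(c))                   /* convert only lower-case letters */
            *s = (char)toupper(c);
    }
    return digits;
}
```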
System V Compatibility Features


3.1. Introduction 



System V Enhancements in 
SunOS 4.0 


This overview is intended for both users and programmers who want to learn 
about System V compatibility features in SunOS 4.0. 

SunOS 4.0 offers Sun users nearly complete System V compatibility. Sun’s
compatibility package allows programmers to write software that meets the Base
Level of the System V Interface Definition (SVID). SunOS 4.0 represents yet
another phase of joint efforts by AT&T and Sun to unify the different versions of
the UNIX system. The two principal versions have been 4.2 BSD (now 4.3
BSD),† and System V in its various releases.

† BSD stands for Berkeley Software Distribution, an outgrowth of research at
U.C. Berkeley.

System V and 4.3 BSD are not radically different, either in the interface they
present to the user, or the routines they provide for the programmer. They are
derived from UNIX systems written by Ken Thompson and Dennis Ritchie in the
mid-seventies, and many features are essentially unchanged since then.

The System V compatibility package permits programmers to write and test
software targeted for either System V or 4.x BSD. Users who acquire software
that runs only on System V can run it by means of the compatibility library.
Commands, system calls, and library routines can be drawn concurrently from
either the Berkeley or System V set. It is even possible to have one window that
uses BSD by preference, and another window that uses System V by preference.

SunOS 4.0 incorporates the full SVID Release 3 Base Level system, which
reflects further progress on System V and BSD convergence. However, SunOS
4.0 does not support mandatory record and file locking. New features include:

□ System calls compatible with SVID Base Level system calls, including:
chown(), creat(), fcntl(), kill(), mknod(), open(), and
utime().

□ Complete System V STREAMS interface, to support portable communication
protocol modules, and to simplify the writing of device drivers.

□ Fully System V and BSD compatible tty(4) interface using STREAMS, and
supporting all character sizes and parity settings.

□ System V compatible archive utility ar(1V).

□ System V batch utilities and job scheduling facilities: at(1), batch(1),
cron(1), and crontab(1).

□ Access to Sun’s value-added libraries (SunView for example) from inside
System V programs.

A Brief History

In early 1985, AT&T released the System V Interface Definition (SVID). This was
a major step because it made explicit exactly what was standard about System V,
and by omission, what was not. In late 1985, Sun and AT&T agreed to work
together to converge the two major strands of UNIX into a single system. In late
1986, Sun’s Release 3.2 combined System V with 4.2 BSD, including almost full
Base Level compatibility. Now in early 1988, SunOS 4.0 offers full Base Level
compatibility, plus compatibility with additional SVID features.

System V programs that are upwards compatible with those in 4.x BSD have
already been added to the regular system directories. For example,
/usr/bin/sh is the new Bourne shell, and /usr/bin/make is backward
compatible, with System V enhancements.

Programs that existed only on System V have been added to the regular system
directories as well. For example, the text manipulation programs cut(1) and
paste(1) both reside in /usr/bin.

System V programs that are incompatible with those in 4.x BSD reside in the
directory /usr/5bin. For example, /usr/5bin/stty has an entirely different
set of options from /usr/bin/stty. If you want to use System V programs
by preference, simply include /usr/5bin early in your path, as in these
lines from the .login or .profile files:

(csh)   set path = (/usr/5bin /usr/bin /usr/ucb .)

(sh)    PATH=/usr/5bin:/usr/bin:/usr/ucb:.
        export PATH


How the Compatibility Tools
Work


The directories /usr/5bin, /usr/5lib, and /usr/5include contain
material that has not yet been converged. Libraries and include files for
compiling System V software reside in /usr/5lib and /usr/5include
respectively. If you want to compile a program written for System V, don’t use
/usr/bin/cc but rather /usr/5bin/cc, which will read all the correct
include files and load the correct libraries. You may want to make an alias or
shell function that invokes the System V compiler, to obviate the need for
changing your PATH:

(csh)   alias cc5 /usr/5bin/cc

(sh)    cc5() {
            /usr/5bin/cc "$@"
        }

The directories that constitute the System V compatibility package are optional,
requiring several megabytes of disk space. The suninstall(8) program lets
you decide whether or not to load these directories.

The Group Mechanism

Sun’s Release 3.2 used the group mechanism from BSD rather than from System
V. SunOS 4.0, by contrast, allows both group mechanisms to work together.
When the set-GID bit is set on a directory, a file created in that directory will be
assigned the directory’s GID (BSD semantics). Otherwise, it will be assigned the
effective GID of the creating process (System V semantics).

In either case, the set-GID bit will be set when mkdir(2) creates new directories.
Users will be able to set up their login directories to follow the semantics they
prefer. SunOS 4.0 distribution tapes are shipped with the set-GID bit set on all
directories, thereby giving BSD semantics as the default. When you install
SunOS 4.0, if you want to mount old filesystems and have them act as they did in
the past, type the following command line for each mounted file system:

# find mounted_directory -type d -exec chmod g+s "{}" \;

To set System V semantics on some portion of the installed system, use g-s
instead of g+s in the above command line. There is a mount option called
grpid that always provides BSD semantics. This option may be needed when a
SunOS 4.0 client mounts a file system from a server that has not yet been
upgraded to SunOS 4.0.

Compatibility of System Calls

For security reasons, the system call chown() requires root privilege. On
System V, by contrast, the owner of a file may change its ownership. Allowing
this would make the quota mechanism completely unenforceable.

The system call utime() now allows file time stamps to be set by any process
with write permission on the file.

The system call kill() may now send signals to any process with an effective
or real UID that matches the effective or real UID of the sender. As before, root
processes may send signals to any process.
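
The kill() permission rule can be written as a small predicate. The following is only a model of the rule as worded above, under the assumption that a "root process" is one whose effective UID is 0; the function name may_signal is hypothetical.

```c
#include <sys/types.h>

/* Hypothetical model of the permission rule above, not kernel code.
 * s_* are the sender's real and effective UIDs, t_* the target's.
 * Returns 1 if the signal may be sent, 0 otherwise. */
int may_signal(uid_t s_ruid, uid_t s_euid, uid_t t_ruid, uid_t t_euid)
{
    if (s_euid == 0)                       /* assumed meaning of "root process" */
        return 1;
    return s_ruid == t_ruid || s_ruid == t_euid ||
           s_euid == t_ruid || s_euid == t_euid;
}
```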

The system call mknod() may now be used to create directories, although the
system call mkdir() is preferred.

With the system call fcntl(), the flags F_SETFL and O_NDELAY differ
between the include files in /usr/include and /usr/5include. In either
case they behave as expected for the corresponding environment. For BSD, they
affect all references to the underlying file. For System V, they affect only file
descriptors associated with the same file table entry.
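
The file-table-entry behavior can be observed with a short sketch. It assumes a present-day POSIX system, where the flag is spelled O_NONBLOCK (O_NDELAY is the older name) and where status flags belong to the open file description, so that descriptors obtained from dup() share them. The helper names are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Turn on non-blocking I/O for fd. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Report whether fd has non-blocking I/O enabled
 * (1 = yes, 0 = no, -1 = error). */
int is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) != 0;
}
```

Setting the flag through a duplicate of a descriptor changes what is_nonblocking() reports for the original, because both refer to the same file table entry; a descriptor opened separately is unaffected.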

The terminal driver now supports 5-bit and 6-bit characters, and arbitrary settings
of VMIN and VTIME. However, the default erase and kill characters are not #
and @ but rather Delete and Control-U.

3.2. SVID Compliance in
SunOS 4.0


The Venn diagrams in this section demonstrate how SunOS 4.0 complies with
release 3 of the System V Interface Definition (SVID).


Figure 3-1 SVID Base System OS Service Routines


SVID-compliant in SunOS 4.0

Figure 3-3 SVID Kernel Extension OS Service Routines

Figure 3-6 SVID Administered Systems Extension Utilities

Figure 3-9 SVID Terminal Interface Extension Utilities


SVID-compliant in SunOS 4.0

tic    tput


Figure 3-10 SVID Terminal Interface Extension Library Routines


SVID-compliant in SunOS 4.0

                  erase()       newpad()           scr_restore()
addch()           erasechar()   newterm()          scroll()
addstr()          fixterm()     newwin()           scrollok()      wattroff()
attroff()         flash()       nl()               set_term()      wattron()
attron()          flushinp()    nocbreak()         setscrreg()     wattrset()
attrset()         getbegyx()    nodelay()          setterm()       wclear()
baudrate()        getch()       noecho()           setupterm()     wclrtobot()
beep()            getmaxyx()    nonl()             slk_**()        wclrtoeol()
box()             getstr()      noraw()            standend()      wdelch()
cbreak()          gettmode()    overlay()          standout()      wdeleteln()
clear()           getyx()       overwrite()        subpad()        wechochar()
clearok()         halfdelay()   pechochar()        subwin()        werase()
clrtobot()        has_ic()      pnoutrefresh()     tgetent()       wgetch()
clrtoeol()        has_il()      prefresh()         tgetflag()      wgetstr()
copywin()         idlok()       printw()           tgetnum()       winch()
def_prog_m*()     inch()        putp()             tgetstr()       winsch()
def_shell_m*()    initscr()     raw()              tgoto()         winsertln()
delay_output()    insch()       refresh()          touchline()     wmove()
delch()           insertln()    reset_prog_m*()    touchwin()      wnoutrefresh()
deleteln()        intrflush()   reset_shell_m*()   tparm()         wprintw()
delwin()          keyname()     resetterm()        tputs()         wrefresh()
doupdate()        keypad()      resetty()          typeahead()     wscanw()
echo()            killchar()    saveterm()         unctrl()        wsetscrreg()
echochar()        leaveok()     savetty()          vidattr()       wstandend()
endwin()          longname()    scanw()            vidputs()       wstandout()
                  move()        scr_dump()         waddch()
                  mv**()        scr_init()         waddstr()


microsystems 


Revision A of 9 May 1988 



48 Programming Utilities and Libraries 


Figure 3-11 SVID Open Systems Networking Interfaces Library Routines

Shared Libraries




Operating systems like SunOS have long achieved more efficient use of memory
by sharing a single physical copy of a program’s text (code) among the processes
executing it. But while the text of a program may be shared among its
concurrent invocations, a significant portion of that text, consisting of library
routines, may be duplicated as part of other running programs. For example,
widely-used library functions such as printf() may be replicated any number of
times throughout memory, and again in various executables throughout the file
system. This suggests that still-greater efficiencies can be had by sharing text at
the library level whenever possible.

The SunOS shared library mechanism improves resource utilization in a way that 
is both straightforward and flexible: 

□ No specialized kernel support is required; it uses the standard
memory-mapping and copy-on-write features provided by the mmap(2) system call
and the kernel memory management facilities.

□ It is designed to minimize the burdens placed on users of existing code. In 
particular: 

• Shared libraries are transparent to the programs that use them, as well as 
the build procedures for those programs. 

• They are largely transparent to standard system utilities, including 
debuggers. 

• Shared libraries are transparent to library source code written in C. 
However, some special procedures are necessary when building the 
shared libraries themselves. 

• The allocation of address space for shared library routines is handled 
automatically. 

• Unlike statically-linked executables, programs that rely on shared 
libraries need not be rebuilt if an underlying library changes (so long as 
that library’s calling interface remains compatible). 

• The use of shared libraries is not required. You can specify the static 
version of a SunOS shared library as desired. 

In addition, shared libraries enhance the development environment by making it
easier to modify and test compatible updates to library functions.

4.1. Definitions

Shared Object

A shared object, or .so file, is an a.out(5) format file produced by ld(1). A
shared object differs from a runnable program in that it lacks an initial entry
point.

Shared Library

A “shared library” is a shared object file that is used as a library. In cases where
the shared library exports initialized data, the shared object (.so) may be paired
with an optional “data interface description” (.sa) file. (See Building a Shared
Library, below, for details.)

Static vs. Dynamic Link Editing

Link editing is the set of operations necessary to build an executable program
from one or more object files. Static linking indicates that the results of these
operations are saved to a file. Dynamic linking refers to these same link-edit
operations when performed at run-time; the executable that results from dynamic
linking appears in the running process, but is not saved to a file.

Position Independent Code (PIC)

Position-independent code (PIC) requires link editing only to relocate references
to objects that are external to the current object module. Position-independent
code is readily shared.

Static and Dynamic Link Editors

The link-editing facilities of ld have been made available for use at run-time as
well as at compile-time. At compile time, the static link editor, ld, can build an
executable file in which some symbols remain unresolved. An executable
(a.out) file that contains unresolved symbols is said to be incomplete.
Incomplete executables require dynamic link editing at run-time.

The dynamic link editor, /usr/lib/ld.so, uses the system’s memory
management facilities to map in and bind the shared object files that are required
at run-time, and performs the link editing operations that were deferred by ld.

As long as the text bound-in at run-time is not subsequently modified (say, by a
link-edit operation or an update to initialized external data), it remains shared
among the various (disparate) programs that use it. However, if the text of a
shared routine should need to be modified by a process during the course of
execution, local (exclusive) copies of the affected pages are created and maintained.

4.2. Using Shared Libraries

For the application developer, the decision to use shared libraries is made at the
static linking phase, when running ld. By default, if a shared version of a library
is available, ld constructs an executable that uses the shared version.

Building a Program to Use Shared Libraries

ld combines a variety of object files to produce an executable (a.out) file.
Exactly what code gets produced, and how complete the a.out is, depends on
the command-line options and input files supplied as arguments on the command
line. ld simply defers the resolution of any symbols that remain after it has run
out of definitions, and assumes that the program will be fully linked by ld.so at
run-time. ld accepts as input:

□ Simple object files. ld simply concatenates (and links) .o files in the order
that they are encountered.

□ ar(1) libraries. Each .a file is searched exactly once as it is encountered,
and only those definitions that match an unresolved external symbol are
extracted, concatenated to the text (or data), and linked.

□ Shared objects. Any .so encountered is searched for symbol definitions
and references, but does not normally contribute to the concatenated text
(see Binding of PIC with Non-PIC, for exceptions having to do with ld’s
-dc option). However, the occurrence of each shared object is noted in the
resulting a.out file; this information is used by ld.so to perform
dynamic link editing at run-time.

ld’s output can be one of two basic types:

□ An “executable” (a.out) file. This file is either a program, if it has an
entry point, or a shared object (.so), if it does not.

□ Another “simple object” (.o) file. When given the -r flag, ld combines
the input object files to form a single, larger one. (This is a special use for
ld which is of little relevance to shared libraries.)

You can indicate which libraries are to be used by supplying a -lname option on
the ld command line for each. ld searches each library in the order specified.
The name string is an abbreviated version of the library’s filename; the full name
is of the form ‘libname.a’ if in archive format, or ‘libname.so.version’ if
it is in shared object form (see Version Control below, for a detailed discussion
of the version suffix). At ld-time, this version information is noted; it must be
matched properly for successful binding at run-time by ld.so.
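
The naming convention can be illustrated with a short sketch. It assumes that the version suffix consists of a major and a minor number separated by a period; the helper name library_filename and the version numbers used with it are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: expand the name given to -l into a file name.
 * With shared != 0 the result is libNAME.so.MAJOR.MINOR, otherwise
 * libNAME.a. Returns the length that snprintf() reports. */
int library_filename(char *buf, size_t size, const char *name,
                     int shared, int major, int minor)
{
    if (shared)
        return snprintf(buf, size, "lib%s.so.%d.%d", name, major, minor);
    return snprintf(buf, size, "lib%s.a", name);
}
```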

The location of the library specified by a -l option is determined by an ordered
list of directories in which to search called the library search path. This search
path is composed as follows: first, the value of the LD_LIBRARY_PATH
environment variable (a colon-separated list of pathnames); then, any and all
directories specified with -Ldirectory options; and finally, the (default)
directories /usr/lib and /usr/local/lib.
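
The ordering of the search path can be sketched as follows. This is an illustration of the order described above, not the code used by ld; the function name build_search_path, the fixed 256-byte entries, and the decision to skip empty LD_LIBRARY_PATH components are all assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: write the ordered search path into out[] (at most
 * max entries of up to 255 characters) and return the number written.
 * ld_library_path may be NULL; ldirs holds the -L directories in order. */
int build_search_path(const char *ld_library_path,
                      const char *const ldirs[], int nldirs,
                      char out[][256], int max)
{
    static const char *const defaults[] = { "/usr/lib", "/usr/local/lib" };
    int n = 0, i;

    /* 1. colon-separated entries from LD_LIBRARY_PATH */
    while (ld_library_path != NULL && *ld_library_path != '\0' && n < max) {
        size_t len = strcspn(ld_library_path, ":");

        if (len > 0 && len < 256) {          /* assumption: skip empty entries */
            memcpy(out[n], ld_library_path, len);
            out[n][len] = '\0';
            n++;
        }
        ld_library_path += len;
        if (*ld_library_path == ':')
            ld_library_path++;
    }
    /* 2. directories given with -L, in command-line order */
    for (i = 0; i < nldirs && n < max; i++)
        snprintf(out[n++], 256, "%s", ldirs[i]);
    /* 3. the default directories */
    for (i = 0; i < 2 && n < max; i++)
        snprintf(out[n++], 256, "%s", defaults[i]);
    return n;
}
```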

Each directory supplied with -L is recorded for use when the program is
executed, as are the default directories. Directory search information obtained from
LD_LIBRARY_PATH is not recorded in this manner. However, the search path
that LD_LIBRARY_PATH contains at run-time is searched at that time; this
allows an alternate set of libraries to be used.

At ld-time, the library search is satisfied by the first occurrence of either form of
the library (.so or .a if no .so is found), but if both versions are found in the
same directory, the .so form is used by default. However, the choice of whether
a .so or .a version is used by ld can be controlled by the binding mode
options described in the next section.

Binding Mode Options

-Bstatic and -Bdynamic

You can specify the binding mode by supplying one of the -Bkeyword options
on the command line:

-Bdynamic    Allow dynamic binding, do not resolve symbolic references,
             and allow creation of execution-time symbol and relocation
             information. This is the default setting. Note that ld records
             the name of the .so file with the highest version number in
             the executable.

-Bstatic     Force static binding. This mode is also implied by options that
             generate non-sharable executable formats.

-Bdynamic and -Bstatic may both be specified a number of times to toggle
the binding mode for specific libraries. Like -l, their influence is dependent
upon their location in the command line. Libraries that appear after a
-Bstatic are linked statically. Libraries that appear after a -Bdynamic are
treated as shared (when a shared version is available).

NOTE  Since -Bdynamic is the default setting, the use of shared libraries in the
construction of a program thus “falls out” from installing the .so in ld’s library
search path.

If -Bstatic is in effect, ld refuses to use the .so form of a library; it
continues searching for an equivalent library with the .a suffix, and an explicit request
to load a .so file is treated as an error.


The following example shows how -Bstatic and -Bdynamic can be used to
select shared and static libraries. This cc command:

cc -o test test.c -Bstatic -lsuntool -Bdynamic -lsunwindow -lpixrect

generates the ld command:

/bin/ld -dc -dp -e start -X -o test /usr/lib/crt0.o test.o -Bstatic -lsuntool \
-Bdynamic -lsunwindow -lpixrect -lc

Since -Bstatic turns off the use of shared libraries, ld finds the static (.a)
suntool library and uses it for link editing immediately. The subsequent
-Bdynamic option tells ld to use shared versions of the sunwindow,
pixrect and C libraries, if available.
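
The position-dependent effect of the two options can be modelled with a simple scan of the argument list. This sketch is not taken from ld; the function name binding_mode_for is hypothetical, and it only tracks which mode is in effect when a given -l option is reached.

```c
#include <string.h>

/* Hypothetical sketch: scan an ld-style argument list and report the
 * binding mode in effect for the library named lib.
 * Returns 1 for dynamic, 0 for static, -1 if -llib does not appear. */
int binding_mode_for(const char *const argv[], int argc, const char *lib)
{
    int dynamic = 1;                 /* -Bdynamic is the default */
    int i;

    for (i = 0; i < argc; i++) {
        if (strcmp(argv[i], "-Bstatic") == 0)
            dynamic = 0;
        else if (strcmp(argv[i], "-Bdynamic") == 0)
            dynamic = 1;
        else if (strncmp(argv[i], "-l", 2) == 0 &&
                 strcmp(argv[i] + 2, lib) == 0)
            return dynamic;          /* mode at the point -llib appears */
    }
    return -1;
}
```

Applied to the options of the ld command above, the sketch reports static binding for suntool and dynamic binding for sunwindow, pixrect and c.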


-N and -n Options for ld

The ld options -N and -n instruct ld to build a non-pageable executable. Their
use implies a -Bstatic option.

Binding of PIC with Non-PIC

-dc and -dp Options

As noted in the above example, the cc command generates an ld command with
the -dp and -dc options. These options are included to facilitate binding of
non-PIC code (generated by default) with the PIC shared libraries that a program
might use. The bindings of interest are to:

□ commons (externs): allocated after the program is completely assembled
(-dc);

□ initialized data: imported from the shared libraries (-dc); and

□ entry points: supplied by the shared libraries (-dp).

Without special handling, references to these objects would require
execution-time link editing, resulting in unsharable code. To improve the degree of sharing
for such programs, -dc and -dp force the allocation of commons and the
creation of aliases for library entry points, respectively. These allocations and aliases
are created as part of the non-PIC executable, and result in programs that are
considered to be “pure-text” non-PIC programs, even though they may require
dynamic link editing.

NOTE  While it is possible to invoke the ld command directly, it is generally better
practice to rely on the compiler-driver (such as cc) to generate the appropriate
ld command, so as to remain insulated from any future changes in the
compilation environment. Compiler commands such as cc accept and pass on options to
ld.

Use of Assertions

The assert Option

To help detect any potential sharability or correctness problems, ld can validate
certain assertions about an executable that it builds. This assertion checking is
invoked by the ‘-assert keyword’ option, where keyword is one of:

definitions  If the resulting program were run now, there would be no
             run-time undefined symbol diagnostics. This assertion is set by
             default, and is sufficient for validating applications that make
             use of shared libraries.

pure-text    The resulting executable requires no further relocations to its
             text. The code of a shared library should be validated using
             this assertion.

Run-Time Use of Shared
Libraries

At run-time, ld.so finishes the job started by ld. That is, it performs the
link-editing operations needed to resolve a program’s remaining references using
shared-library code and data. ld.so’s first task is to find and map in the
required libraries. It applies the same library search rules as ld, looking first in
the directories specified by the current value of LD_LIBRARY_PATH, and then
in the directories in the search path recorded by ld (the default directories and
those specified by -L). In addition, ld.so attempts to find the “best” version
of a shared library, that is, the version with the highest minor number (as
described under Version Control below).

SunOS Shared Libraries

The shared libraries provided in SunOS are:

□ The C library (both BSD and System V variants)

□ Window libraries (suntool and sunwindow)

□ pixrect

□ kernel virtual memory access (kvm)

Static (.a) versions of these libraries are also provided.

Dynamic vs. Static Binding
Semantics

There are some semantic differences between dynamic and static binding. These
are not expected to cause a problem with programs that avoid questionable
practices with regard to library search order. However, there is a potential for
problems when programs are built from some components that have become
dynamically loadable, while others remain static. Consider the following case:

hermes% ld -o x ... dc sc

The executable x is composed of several objects, including a dynamic
component, dc, and a static component, sc. dc was, prior to the introduction of
shared libraries, an unordered archive file, and both dc and sc contain
definitions for the symbol getsym. Suppose that dc contains a reference to
getsym. If, in dc’s archive version, the definition for getsym preceded its
reference, ld might have resolved that reference using the definition from sc.
But in dc’s current (dynamic) form, its own definition is used instead. This is a
result of the fact that at run-time, ld.so searches for a symbol definition
starting with the main program, and then all .so’s in load order. Even though it
allows for an inconsistency of this sort, this behavior preserves the ability to
interpose definitions on library entry points.

Debuggers

The SunOS debuggers have been modified to deal with the dynamic linking
environment provided by the new ld. In particular, they understand that symbol
definitions may appear after a program starts executing. However, debugger users
must be aware that library symbols will not be resolved until main() has been
called, as the next example shows.

hermes% cc -g -o test test.c
hermes% dbx test
Reading symbolic information...
Read 40 symbols
(dbx) stop in printf
no module, procedure or file named 'printf'
(dbx) stop in main
(1) stop in main
(dbx) run
Running: test
stopped in main at line 4 in file "test.c"
    4   printf("%d\n", errno);
(dbx) stop in printf
(3) stop in printf
(dbx) cont
stopped in printf at 0xed76954
0xed76954:      moveml  #<d7,a5>,sp@
Current function is main
    4   printf("%d\n", errno);

Users of debugging tools also need to be aware that core files have incomplete
information on the state of shared code. Core files contain only the stack and
data regions of a process image. The text, and more importantly, the static data
regions of dynamically loaded objects do not appear. Thus, modifications made
to initialized data are not reflected in the core file.

Performance Issues

Shared libraries represent a classic space vs. time trade-off. The work of
incorporating the library code into an address space is deferred in order to save both
primary and secondary storage. Therefore, one can expect to pay a slight CPU
time penalty with programs that use shared libraries. This penalty can be
attributed to the added cost of:

□ dynamically loading the libraries,

□ performing the link editing operations, and

□ the execution of the library PIC code.

However, these costs can be offset by the savings in I/O access time when library
code is already mapped in by another program, since the (real) I/O time required
to bring in a program and begin execution will be greatly reduced. As long as the
CPU time required to merge the program and its libraries does not exceed the I/O
time saved, the apparent performance of the program will be the same or better.
However, if sharing does not occur, or if the system’s CPU is already saturated,
such savings may not be achieved.


Dependencies on Other Files

A dynamically bound program consists not only of the executable file that is the
output of ld, but also of the files referred to during execution. Moving a
dynamically bound program may also involve moving a number of other files.
Moving (or deleting) a file on which a dynamically bound program depends may
prevent that program from functioning.

For those programs that execute with an effective UID (user ID) or GID (group
ID) different than the real UID or GID, ld.so ignores libraries in directories
other than /usr/lib and /usr/5lib in the search path.

A version numbering mechanism has been provided for shared libraries. This
allows newer compatible versions of a library to be bound at run-time. It also
allows the link editors to distinguish between compatible and incompatible
versions of a library.

Version Numbers of .so’s

The version number is composed of two parts, a major version number and a
minor version number. This version-control suffix can be extended to an arbitrary
string of numbers in Dewey-decimal format, although only the first two components
are significant to the link editors at this time.

As noted earlier, ld records the version number of the shared library in the
executable it builds. When ld.so searches for the library at run-time, it uses this
number to decide which of the (possibly multiple) versions of a given library is
“best,” or whether any of the available versions are acceptable. The rules it
follows are:

□ Major Versions Identical: the major version used at execution time must
exactly match the version found at ld-time. Failure to find an instance of
the library with a matching major version will cause a diagnostic to be
issued and the program’s execution terminated.

□ Highest Minor Version: in the presence of multiple instances of libraries
that match the desired major version, ld.so will use the highest minor
version it finds. However, if the highest minor version found at execution time
is lower than the version noted at ld-time, a warning diagnostic is issued.
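
The two rules above can be sketched as a selection function. This is an illustration of the rules as stated, not the code in ld.so; the structure so_version and the function choose_version are hypothetical names.

```c
#include <stddef.h>

struct so_version { int major; int minor; };

/* Hypothetical sketch of the two rules above. wanted is the version
 * recorded at ld-time; avail[] lists the versions found at run-time.
 * Returns the index of the chosen version, or -1 if no major version
 * matches. *warn is set to 1 when the chosen minor version is lower
 * than the recorded one, and to 0 otherwise. */
int choose_version(struct so_version wanted,
                   const struct so_version avail[], size_t n, int *warn)
{
    int best = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (avail[i].major != wanted.major)
            continue;                      /* major must match exactly */
        if (best < 0 || avail[i].minor > avail[best].minor)
            best = (int)i;                 /* keep the highest minor */
    }
    *warn = (best >= 0 && avail[best].minor < wanted.minor);
    return best;
}
```

A return value of -1 corresponds to the fatal diagnostic of the first rule; a nonzero *warn corresponds to the warning of the second.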

Major version numbers should be changed whenever there is an incompatible
change to the library’s interface.

NOTE  As always, the detection of incompatibilities between library versions remains
the responsibility of the library’s developer.

Version Management Issues

Whenever there is an incompatible change to the library’s calling interface, the
major number of that library should be changed. A library’s interface is defined
by:

□ the names and types of exported functions and their parameters; and

□ the names and types of exported data (initialized or not).

Incompatible changes would include the deletion of an exported procedure,
deletion of exported data, changes to a procedure’s parameter list, and changes to
data structures declared in a .h file normally included by both the library and the
applications that use it.

Changes to internal library procedures and data do not constitute an interface
change.

Minor versions should be changed to reflect compatible updates to libraries. An
example of a compatible update would be changing a procedure’s algorithm
without changing its parameter list. Although adding a new library routine
constitutes an interface change, it can be considered a compatible change.

Note that link-editors silently select the highest compatible version they can
obtain. If the minor version used at ld-time is higher than the highest one found
at run-time, then although the interfaces should remain compatible, it is possible
that certain bug fixes or compatible enhancements on which the application
depends might be missing: hence the warning message mentioned above.

There is no single mechanism in SunOS that implements shared libraries. 

Instead, the ability to constmct a shared library comes as a consequence of 
enhancements to various existing facilities. The system components and their 
features that are instrumental in supporting shared libraries are: 

o Virtual memory supports file mapping and “copy-on-write” sharing 

□ PIC generation by the compiler and assembler 

a Link editor support for dynamic linking and loading 

Memory sharing is provided by the kernel’s virtual memory (VM) system. The
mechanisms of interest for shared libraries are:

□ File mapping by way of mmap()

□ Sharing at the granularity of a file page

□ A per-page copy-on-write facility that allows run-time modification of a
shared file, without affecting other users of that same file

The VM system uses these features internally, so that an exec() of a program
is reduced to establishing a copy-on-write mapping of the file containing the
program. A shared library is added to the address space in exactly the same way,
using this general file-mapping mechanism.
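
As a rough illustration of copy-on-write file mapping, the sketch below uses
the POSIX form of mmap() with MAP_PRIVATE on a scratch file. The function name
and the /tmp file name are assumptions made for the example; it shows the
general mechanism, not how exec() or ld.so use it internally.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map a scratch file copy-on-write, modify the mapping, and check that
 * the file itself still holds its original first byte.
 * Returns 1 if the file is unchanged, 0 if it changed, -1 on error.
 */
int private_mapping_leaves_file_intact(void)
{
    char path[64];
    char first = 0;
    char *p;
    int result = -1;
    int fd;

    /* Scratch file name; the /tmp location is an assumption. */
    snprintf(path, sizeof path, "/tmp/cowdemo.%ld", (long)getpid());
    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    unlink(path);                       /* storage is freed once fd is closed */
    if (write(fd, "shared text", 11) != 11)
        goto out;

    /* MAP_PRIVATE requests copy-on-write: stores go to a private page. */
    p = mmap(NULL, 11, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        goto out;
    p[0] = 'S';                         /* changes only this process's copy */

    /* Reread the file itself; it should still begin with 's'. */
    if (lseek(fd, 0, SEEK_SET) == 0 && read(fd, &first, 1) == 1)
        result = (first == 's' && p[0] == 'S');
    munmap(p, 11);
out:
    close(fd);
    return result;
}
```

The store through p touches only a private copy of the page, which is the
property that lets many processes share one library file while each modifies
its own data.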

The C compiler’s -pic option generates position-independent code. When -pic
is specified, references to objects that are external to the body of the code
are made by way of linkage tables. These indirect references can degrade
execution performance slightly, depending on the number of dynamic references to
global objects. The code sequences generated often assume that the linkage
tables are no larger than a limit that is convenient for the specific machine (64K
bytes for an MC68000, or 8K for a SPARC, for instance). In the (presumably
rare) event that the tables require a larger size, the compiler can be made to
generate code sequences that permit larger linkage-table entries with the -PIC
option.

Code generated by the -pic option requires support from the assembler. This
support is enabled by the -k assembler flag, which is supplied automatically by
cc when invoking the assembler for a compilation performed with the -pic or
the -PIC option.

User-written assembly code for use in a shared object must also be PIC. Refer to
the appropriate Assembly Language Reference for your Sun system for details.

crt0()

Every main program produced by the standard languages is linked with a program
prologue module, crt0(). This module contains the program’s entry
point, and performs various initializations of the environment prior to calling the
program’s main() function. crt0() refers to the symbol _DYNAMIC. As
described above, when ld builds an executable requiring execution-time link
editing, it defines this symbol as the address of a data structure containing
information needed for execution-time link editing operations. If the structure is not
needed, any reference to the symbol _DYNAMIC is relocated to zero.

At program start-up, crt0() tests to see whether or not the program being
executed requires further link editing. If not, crt0() simply proceeds with the
execution of the program as it always has; no further processing is involved.

However, if _DYNAMIC is defined, crt0() opens the file /usr/lib/ld.so and
requests the system to map it into the program’s address space via the mmap()
system call. It then calls ld.so, passing as an argument the address of its
program’s _DYNAMIC structure. crt0() assumes that ld.so’s entry point is the
first location in its text. When the call to ld.so returns, the link editing
operations required to begin the program’s execution have been completed.
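
The start-up decision described above might be pictured as follows. This is a
simulation in portable C, not the real crt0() source: struct dynamic_info,
runtime_link(), startup(), and sample_main() are hypothetical stand-ins, and
the real module maps /usr/lib/ld.so rather than calling a local function.

```c
#include <stddef.h>

/* Hypothetical stand-in for the structure that ld leaves behind. */
struct dynamic_info {
    int version;
};

int link_edits_done;            /* counts calls to the simulated run-time linker */

/* Stand-in for ld.so's entry point; a real one would map and bind libraries. */
void runtime_link(struct dynamic_info *dyn)
{
    (void)dyn;
    link_edits_done++;
}

/* Stand-in for the program's main() function. */
int sample_main(void)
{
    return 7;
}

/*
 * Start-up logic in the style of crt0(): a null address means the program
 * is fully linked; anything else requests run-time link editing first.
 */
int startup(struct dynamic_info *dynamic, int (*user_main)(void))
{
    if (dynamic != NULL)
        runtime_link(dynamic);
    return user_main();
}
```

Passing NULL models the case in which the reference to _DYNAMIC was relocated
to zero, so the program proceeds directly to its main routine.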

Link Editors: ld and ld.so

After ld has processed all of its input files, it attempts to resolve each symbolic
reference to a relative offset within the executable being built. ld is able to
complete this symbolic reduction at ld-time only if:

□ all information relating to the program has been given and no .so will be
added at execution time, or

□ the program has an entry point and symbolic reduction can be made for
those symbols defined in the program

After performing all the reductions it can, if there are no further symbols to
resolve, the output is a fully linked (static) executable. However, if any
unresolved symbols remain, then the executable will require further link editing
at run-time. In this case, ld deposits the information (including version number)
needed to obtain any needed .so files in the data space of the incomplete
executable.

It should be noted that uninitialized “common” areas (essentially all
uninitialized C globals) are allocated by the link editor after it has collected all
references. In particular, this allocation cannot occur in a program that still requires
the addition of information contained in a .so file, as the missing information
may affect the allocation process. Initialized “commons,” however, are
allocated in the executable in which their definition appears.

After ld has performed all the symbolic reductions it can, it attempts to
transform all relative references to absolute addresses. ld is able to do this
"relative reduction" only if it has been provided some absolute address.

At run-time, after receiving control from crt0(), ld.so executes a short
bootstrap routine that performs any relocations ld.so itself requires. It then
processes the information contained in the _DYNAMIC structure of the program
that called it. ld.so examines the list of required dynamic objects. Each
element of the list contains an offset relative to the _DYNAMIC structure of an
array of link_object structures and has information to identify a .so that
must be incorporated. The identification is the name specified on the ld
command line used to build the program, and includes a bit indicating whether the
object was named explicitly or via a -l option. Some version control
information is also recorded for each entry in the ld_need array. ld.so looks up the
indicated file, and maps it into the process’s address space.

After all modules comprising the program have been placed in the address space,
ld.so attempts to resolve the remaining symbols. After performing allocations
for all uninitialized commons, ld.so attempts to resolve all unbound references
that occur outside of procedure linkage tables.

Unresolved procedural references in the linkage tables are not processed during
program startup. Instead, such references are initialized such that the initial call
results in a transfer of control to ld.so. When called in this way, ld.so first
resolves the reference to an absolute address, and then modifies the linkage table
entry to use that address. Deferring the binding of procedural entry-points until
the first call eliminates unnecessary bindings to entry points that the program
may never require.
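
Deferred binding can be imitated in portable C with a function pointer
standing in for one linkage-table entry. The names below (slot, stub,
lib_double, call_via_table) are invented for this sketch; a real procedure
linkage table is built by ld and patched by ld.so in machine-specific ways.

```c
typedef int (*proc_t)(int);

/* Hypothetical library routine that the program eventually calls. */
int lib_double(int x)
{
    return 2 * x;
}

int bind_count;                 /* how many times the binder has run */

int stub(int x);                /* the binder, defined below */

/* One linkage-table slot; it starts out pointing at the binder. */
proc_t slot = stub;

/* The first call lands here: resolve, patch the slot, then forward the call. */
int stub(int x)
{
    bind_count++;
    slot = lib_double;          /* later calls bypass the binder */
    return slot(x);
}

/* Callers always go through the slot, never directly to the routine. */
int call_via_table(int x)
{
    return slot(x);
}
```

However many times call_via_table() is used, the binder runs once; a routine
that is never called is never bound at all.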

In the simplest of cases, the commands needed to build a shared library might be:

hermes% cc -pic -c *.c
hermes% ld -o libx.so.1.1 -assert pure-text *.o

But note that this assumes that the library exports no initialized data. And it 
makes no guarantee that the library text makes the most efficient possible use of 
space, or allows for a minimal amount of paging. 

As noted earlier, a shared library should be structured to avoid undue
modification in the course of dynamic linking and execution. Otherwise, it is
possible that some or all of the shared text may be rendered unsharable when run.
Although this lack of sharing would not affect the correct execution of library
routines, it will impact system performance. If only a few programs use the
library, this impact is small. But for a widely-used library, the impact on system
performance could be significant. Thus, shared library objects should be PIC,
they should be validated using the pure-text assertion, and those libraries
that export initialized data should be accompanied by a data interface description
(.sa) file.

Components of shared library code should be PIC. For C source code, the objects
must be built with the -pic or -PIC options. User-written assembly code
requires the programmer to follow the same rules that cc follows; details
of specific coding sequences can be found in the appropriate Assembly Language
Reference for a particular Sun system assembler.

Building the .so File

To build the .so portion of a shared library, simply invoke ld with the list of
object files that will comprise it. The version number is not automatically
generated by ld (which creates a file named a.out by default), but you can specify
the full name of the library, including the version number, with ld’s -o option.
It is strongly suggested that you use the -assert pure-text assertion to
uncover any instances of non-PIC code.

The .sa File

The .sa file is used to support ld’s -dc option, which provides a space/time
efficient implementation of the interface between non-position-independent code
and dynamically linked objects. The .sa is an archive library that contains only
the exported initialized data used by the shared library. When present, it is
statically linked at ld-time to ensure correct allocation.

A data item is exported from a library if a program that uses the library refers to
the data item by name. The contents of the data item are included if they are
specified by value in the declaration. For instance, with a definition of the form:

char *strlist[] = { "string 1", "string 2" };

the data itself must be included in the .sa file, whereas with:

char *strlist[] = { ptr1, ptr2 };

definitions for the objects named ptr1 and ptr2 would not necessarily have to
be included. Note that if ptr1 were itself defined as an initialized global in the
library source, say:

char *ptr1 = NULL;

then this definition would also have to go into the .sa file.

Uninitialized data (exported or not) is handled automatically, and need not be
included in the .sa file. If the library does not export any data, then a .sa
would be unnecessary. The full name of a .sa also includes a version number
that must match the version string of the .so it accompanies.

CAUTION Failure to create a .sa file where it is required risks feeding ld erroneous
information about the nature of the interfaces to a library. Furthermore,
failure to use a .sa can result in the application’s text segment becoming
unsharable when run.

For example, suppose a library’s source module contains global initialized data
and is compiled with the -R option (merge initialized data with the text
segment). Furthermore, this data is not included in the .sa file. When a program
references this data, ld.so assumes that it is a procedure and consequently
uses the data item improperly.

Building the .sa File

To build a .sa file:

1. Segregate the declarations of exported initialized data from the sources for
each object, and place them in a separate source file. Make sure that an
up-to-date object is compiled from each of those data-description sources, and
include each of those data-description objects in both the static and shared
versions of the library.

2. Create a separate (static) archive library composed of only the
data-description objects, and give it a name of the form ‘libname.sa.version’.
This archive constitutes the .sa file. Be sure that the .sa has the same
version number as the .so it is to accompany.

3. Use ranlib(1) to incorporate a symbol table within the .sa archive.

4.6. Building a Better Library

Library code that maximizes sharing is considered “better” because it makes
more efficient use of the system’s memory resources. Building the library
components PIC is an important and easy first step, but there are other tuning
strategies to consider as well.

Sizing Down the Data Segment

One way to maximize sharing is to minimize a .so’s data segment (containing
initialized data), and its bss segment (containing uninitialized data). Often a
.so’s data requirements are large because a significant portion of that data
is functionally read-only. There are several problems with this mix of read-only
and modifiable data:

□ data that could be shared is not,

□ an unnecessary amount of swap space is reserved, and

□ read-only data fragments the read-write storage, spreading it over more
pages.

One approach is to move initialized read-only data into the text segment. This is
done by compiling with the -R option. However, caution needs to be exercised,
since initialized data structures that contain pointers require relocation at
run-time.


For instance, given the declarations:

void test();
int x;

struct fxy {
    void (*p0)();
    int *p1;
};

struct fxy example = {test, &x};


The references to &x and test are instances of pointers embedded in an
initialized structure, and should be avoided in shared code. You can avoid problems of
this sort by leaving the structure uninitialized:

struct fxy example;

and adding an initialization routine to set the value(s).

Using xstr to Extract String Definitions

Another common example of initialized data containing pointers is an array of
strings:

char *errlist[] = {"err1", "err2"};

The xstr(1) utility can be used to make code containing initialized strings
more sharable. It segregates the literal string data from its relocatable references,
which allows the literal data to be merged safely into the text segment. However,
files containing references to the string data should not be compiled with the -R
option.

If there are several related pieces of data, another strategy is to coalesce the
smaller items into a larger structure and allocate the space from the heap.
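
One possible shape for these two ideas, an initialization routine and a
coalesced structure taken from the heap, is sketched below. The names
lib_state, lib_init(), and test_proc() (a stand-in for the test() routine in
the earlier declarations) are assumptions made for the example.

```c
#include <stdlib.h>

void test_proc(void) { }        /* stand-in for the routine the structure points to */
int x;

struct fxy {
    void (*p0)(void);
    int *p1;
};

/* Related modifiable items gathered into one structure. */
struct lib_state {
    struct fxy example;
    int call_count;
};

struct lib_state *state;        /* uninitialized, so it needs no relocation */

/*
 * Initialization routine: allocate the state from the heap and fill in
 * the pointers at run time instead of in an initializer.
 * Returns 0 on success, -1 if memory is unavailable.
 */
int lib_init(void)
{
    if (state != NULL)
        return 0;               /* already initialized */
    state = malloc(sizeof *state);
    if (state == NULL)
        return -1;
    state->example.p0 = test_proc;
    state->example.p1 = &x;
    state->call_count = 0;
    return 0;
}
```

A library arranged this way would call lib_init() from its first entry point,
so that the pointers are stored into heap memory rather than into an
initialized data page of the shared object.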


Better Ordering of Objects 


The order of the objects in the executable can be important to minimizing the
memory requirements. Since objects are concatenated together, linking in the
wrong order may result in an unnecessarily large memory requirement. Two
approaches that encourage better utilization of memory resources are:

□ Routines that are frequently called should be packaged together, and isolated
from startup or rarely-called code.

□ A set of routines that represent a common sequence should also be packaged
together. For example, given modules A, B, C, D, and E, where A and B fit
on one VM page, C and D fit on another, and E fits on a partial page, if A
always calls into E and never calls into B, the memory requirements may be
reduced by a page if E follows A.



crt0.o Dependency

Sometimes a program will define its own crt0() initial routine. If it is
intended that the program use shared libraries, then the programmer needs to
provide a hook for the run-time linker. Further discussion of this can be found under
link(5) in the SunOS Reference Manual.

The ldconfig Command

ldconfig(8) is a program used to construct a run-time linking cache for use by
ld.so. The cache has a default list of directories (/usr/lib, /usr/5lib,
/usr/lib/fsoft, /usr/lib/f68881, /usr/lib/ffpa, and /usr/lib/fswitch),
and ldconfig will accept as input a list of additional directories to
augment this list. ldconfig records the pathname of the highest compatible
version of each shared library in the specified search path.

At run-time, ld.so first queries the cache to determine which is the best version
of a library in a particular directory. If the cache is unable to satisfy the request,
ld.so enumerates the directory entries for the best version.
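
The idea of choosing the best version from a directory listing can be sketched
as below. The text does not spell out the comparison ld.so performs, so the
rule shown (same major number, highest minor number, names of the form
libname.so.major.minor) and the function best_version() are assumptions for
illustration only.

```c
#include <stdio.h>
#include <string.h>

/*
 * Given directory entry names, return the index of the entry of the form
 * "lib<name>.so.<major>.<minor>" that has the wanted major number and the
 * highest minor number, or -1 if no entry qualifies.
 */
int best_version(const char *names[], int count, const char *lib, int major)
{
    char prefix[128];
    size_t len;
    int best = -1, best_minor = -1, i;

    snprintf(prefix, sizeof prefix, "lib%s.so.%d.", lib, major);
    len = strlen(prefix);
    for (i = 0; i < count; i++) {
        int minor = 0, used = 0;

        if (strncmp(names[i], prefix, len) != 0)
            continue;               /* different library or major number */
        /* %n records how many characters the minor number occupied. */
        if (sscanf(names[i] + len, "%d%n", &minor, &used) != 1)
            continue;
        if (names[i][len + used] != '\0')
            continue;               /* trailing text, such as ".bak" */
        if (minor > best_minor) {
            best_minor = minor;
            best = i;
        }
    }
    return best;
}
```

Minor numbers are compared as integers, so under this assumed rule an entry
ending in .1.10 ranks above one ending in .1.2.
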

4.7. Shared Library Problems

ld.so Is Deleted

Since many system utilities are built to use shared libraries, and thus rely on
dynamic link-editing, the potential exists for chaos if an important shared library
(such as the C library) or /usr/lib/ld.so should be deleted.

If the latter has been deleted, you will see the following message:

crt0: no /usr/lib/ld.so

To deal with the chaos resulting from either the shared C library or ld.so being
deleted, a number of commands and utilities have been statically linked. These
include: rcp(1), init(8), getty(8), sh(1), csh(1), mv(1), ln(1), tar(1) and
restore(8). Since most system utilities may be rendered unusable by this
condition, it may be necessary to boot the system single-user in order to restore
either /usr/lib/ld.so or the C library. Refer to System and Network
Administration for procedures to restore these files.

Wrong Library Is Used

ld.so will not detect a library that is newly installed in the cache unless the
cache is rebuilt using ldconfig. Thus, a program that depends on the
newly-installed library may not be able to find it. You can use the ldd(1) command to
identify the libraries on which a program depends.

Error Messages

ld.so: libname.so.major not found

ld.so failed to find a library with the appropriate major version number.

ld.so: open error for library
ld.so: can't read struct exec for library
ld.so: library is not for this machine type

Either the shared object has been corrupted, has incorrect access permissions, or
was built to execute on another processor architecture.

ld.so: call to undefined procedure symbol from address
ld.so: Undefined symbol symbol

These messages generally indicate that the execution path attempts to refer to an
undefined symbol. This is usually the result of a programming error.

ld.so: warning library has older version than expected

The version of the shared library that is currently being used has a minor version
number that is lower than the version that was present at the time the application
was compiled.


lint — a Program Verifier for C

lint examines C source programs, detecting a number of bugs and obscurities.
lint enforces the type rules of C more strictly than the C compiler. lint may
also be used to enforce a number of portability restrictions involved in moving
programs between different machines and/or operating systems. Another option
detects a number of wasteful, or error-prone, constructions which nevertheless
are, strictly speaking, legal.

lint accepts multiple input files and library specifications, and checks them for
consistency.

The separation of function between lint and the C compilers has both historical
and practical rationale. The compilers turn C programs into executable files
rapidly and efficiently. This is possible in part because the compilers do not do
sophisticated type checking, especially between separately compiled programs.
lint takes a more global, leisurely view of the program, looking much more
carefully at the compatibilities.

This document discusses the use of lint, gives an overview of its
implementation, and gives some hints on writing machine-independent C code.

Using lint

Suppose there are two C source files, file1.c and file2.c, which are ordinarily
compiled and loaded together. The command:

lint file1.c file2.c

produces messages describing inconsistencies and inefficiencies in the programs.
lint enforces the typing rules of C more strictly than the C compiler (for both
historical and practical reasons) enforces them. The command:

lint -p file1.c file2.c

produces, in addition to the types of messages described above, additional
messages relating to portability of the programs to other operating systems and
machines. Replacing the -p by -h produces messages about various error-prone
or wasteful constructions which, strictly speaking, are not bugs. Saying -hp gets
the whole works.

The next several sections describe the major messages; the document closes with
sections discussing the implementation and giving suggestions for writing
portable C. There is a summary of lint options in section lint Options.

5.2. A Word About Philosophy

Many of the facts which lint needs may be impossible to discover. For
example, whether a given function in a program ever gets called may depend on the
input data. Deciding whether exit is ever called is equivalent to solving the
famous ‘halting problem,’ which is known to be recursively undecidable.

Thus, most of the lint algorithms are a compromise. If a function is never
mentioned, it can never be called. If a function is mentioned, lint assumes it
can be called; this is not necessarily so, but in practice is quite reasonable.

lint tries to give information with a high degree of relevance. Messages of the
form ‘xxx might be a bug’ are easy to generate, but are acceptable only in
proportion to the fraction of real bugs they uncover. If this fraction of real bugs is too
small, the messages lose their credibility and serve merely to clutter up the
output, obscuring the more important messages.

Keeping these issues in mind, we now consider in more detail the classes of
messages which lint produces.

5.3. Unused Variables and Functions

As programs evolve and develop, previously used variables and arguments to
functions may become unused; it is not uncommon for external variables, or even
entire functions, to become unnecessary, and yet not be removed from the source.
These ‘errors of commission’ rarely make working programs fail, but they are a
source of inefficiency, and make programs harder to understand and change.
Moreover, information about such unused variables and functions can
occasionally serve to discover bugs; if a function does a necessary job, and is never
called, something is wrong!

lint complains about variables and functions which are defined but not
otherwise mentioned. An exception is variables which are declared through explicit
extern statements but are never referenced; thus the statement:

extern float sin();

will evoke no comment if sin is never used. Note that this agrees with the
semantics of the C compiler. In some cases, these unused external declarations
might be of some interest; they can be discovered by adding the -x option to the
lint invocation.

Certain styles of programming require many functions to be written with similar
interfaces; frequently, some of the arguments may be unused in many of the
calls. The -v option is available to suppress the printing of complaints about
unused arguments. When -v is in effect, no messages are produced about unused
arguments except for those arguments which are unused and also declared as
register arguments; this can be considered an active (and preventable) waste of
the register resources of the machine.

There is one case where information about unused, or undefined, variables is
more distracting than helpful. This is when lint is applied to some, but not all,
files out of a collection which are to be loaded together. In this case, many of the
functions and variables defined may not be used, and, conversely, many
functions and variables defined elsewhere may be used. The -u option may be used
to suppress the spurious messages which might otherwise appear.

5.4. Set/Used Information

lint attempts to detect cases where a variable is used before it is set. This is
very difficult to do well; many algorithms take a good deal of time and space,
and still produce messages about perfectly valid programs. lint detects local
variables (automatic and register storage classes) whose first use appears
physically earlier in the input file than the first assignment to the variable. It assumes
that taking the address of a variable constitutes a ‘use,’ since the actual use may
occur at any later time, in a data-dependent fashion.

The restriction to the physical appearance of variables in the file makes the
algorithm very simple and quick to implement, since the true flow of control need not
be discovered. It does mean that lint can complain about some programs
which are legal, but these programs would probably be considered bad on
stylistic grounds (for example, might contain at least two goto’s). Because static and
external variables are initialized to 0, no meaningful information can be
discovered about their uses. The algorithm deals correctly, however, with
initialized automatic variables, and variables which are used in the expression which
first sets them.

The set/used information also permits recognition of those local variables which
are set and never used; these form a frequent source of inefficiencies, and may
also be symptomatic of bugs.
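
The textual-order rule is simple enough to model in a few lines. In this
sketch, which is not lint's actual code, the appearances of one local variable
are given as a string in source order, with 's' for an assignment and 'u' for
a use; both function names are hypothetical.

```c
/*
 * Returns 1 if the first use of the variable appears textually before
 * its first assignment, which is the case that draws a complaint.
 */
int used_before_set(const char *events)
{
    for (; *events != '\0'; events++) {
        if (*events == 's')
            return 0;           /* an assignment comes first */
        if (*events == 'u')
            return 1;           /* a use comes first */
    }
    return 0;                   /* the variable never appears */
}

/* Returns 1 if the variable is assigned somewhere but never used. */
int set_but_unused(const char *events)
{
    int was_set = 0;

    for (; *events != '\0'; events++) {
        if (*events == 'u')
            return 0;
        if (*events == 's')
            was_set = 1;
    }
    return was_set;
}
```

Because only the order of appearance matters, no flow analysis is needed. A
variable used in the expression that first sets it would be encoded as "us"
and is reported.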

5.5. Flow of Control

lint attempts to detect unreachable portions of the programs which it
processes. It complains about unlabeled statements immediately following
goto, break, continue, or return statements. An attempt is made to
detect loops which can never be left at the bottom, detecting the special cases
while( 1 ) and for(;;) as infinite loops. lint also complains about
loops which cannot be entered at the top; some valid programs may have such
loops, but at best they are bad style, at worst bugs.

lint has an important area of blindness in the flow of control algorithm: it has
no way of detecting functions which are called and never return. Thus, a call to
exit may cause unreachable code which lint does not detect; the most serious
effects of this are in the determination of returned function values (see the next
section).

One form of unreachable statement that lint does not complain about is a
break statement that cannot be reached; programs generated by yacc, and
especially lex, may have literally hundreds of unreachable break statements.
The -O option in the C compiler often eliminates the resulting object code
inefficiency. Thus, these unreached statements are of little importance; there is
typically nothing the user can do about them, and the resulting messages would
clutter up the lint output. If these messages are desired, lint can be invoked
with the -b option.

5.6. Function Values

Sometimes functions return values which are never used; sometimes programs
incorrectly use function ‘values’ which are never returned. lint addresses this
problem in a number of ways.

Locally, within a function definition, the appearance of both:

return ( expr );

and:

return;

statements results in the message

function name contains return(expr) and return

The most serious difficulty with this is detecting when a function return is
implied by flow of control reaching the end of the function. This can be seen
with a simple example:

f ( a ) {
    if ( a )
        return ( 3 );
    g ( );
}

Notice that, if a tests false, f will call g and then return with no defined return
value; this will trigger a complaint from lint. If g, like exit, never returns,
the message will still be produced when in fact nothing is wrong.
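
One way to rewrite the example so that neither complaint arises is to return a
value on every path. The return value 0 and the counter g_calls are
assumptions added for the sketch.

```c
int g_calls;                    /* counts calls to g(), for illustration only */

void g(void)
{
    g_calls++;
}

/* Every path returns a value, so no return is implied by falling off the end. */
int f(int a)
{
    if (a)
        return 3;
    g();
    return 0;                   /* explicit value after the call to g() */
}
```

If g() really never returned, the final return statement would be unreachable
but harmless, and the function would still be consistent about returning a
value.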

In practice, some potentially serious bugs have been discovered by this feature; it 
also accounts for a substantial fraction of the ‘noise’ messages produced by 
lint. 

On a global scale, lint detects cases where a function returns a value, but this
value is sometimes, or always, unused. When the value is always unused, it may
constitute an inefficiency in the function definition. When the value is
sometimes unused, it may represent bad style (for example, not testing for error
conditions).

The dual problem, using a function value when the function does not return one, 
is also detected. This is a serious problem. Amazingly, this bug has been 
observed on a couple of occasions in ‘working’ programs; the desired function 
value just happened to have been computed in the function return register! 

5.7. Type Checking

lint enforces the type checking rules of C more strictly than the compiler does.
The additional checking is in four major areas: across certain binary operators
and implied assignments, at the structure selection operators, between the
definition and uses of functions, and in the use of enumerations.

There are a number of operators which have an implied balancing between types
of the operands. The assignment, conditional (?:), and relational operators have
this property; the argument of a return statement, and expressions used in
initialization also suffer similar conversions. In these operations, char, short,
int, long, unsigned, float, and double types may be freely intermixed.
The types of pointers must agree exactly, except that arrays of x’s can, of course,
be intermixed with pointers to x’s.

The type checking rules also require that, in structure references, the left operand
of the -> be a pointer to structure, the left operand of the . be a structure, and
the right operand of these operators be a member of the structure implied by the
left operand. Similar checking is done for references to unions.

Strict rules apply to function argument and return value matching. The types
float and double may be freely matched, as may the types char, short,
int, and unsigned. Also, pointers can be matched with the associated arrays.
Aside from this, all actual arguments must agree in type with their declared
counterparts.

With enumerations, checks are made that enumeration variables or members are
not mixed with other types, or other enumerations, and that the only operations
applied are =, initialization, ==, !=, and function arguments and return values.

5.8. Type Casts

The type casting feature in C was introduced largely as an aid to producing more
portable programs. Consider the assignment:

p = 1 ;

where p is a character pointer. lint will quite rightly complain. Now, consider
the assignment:

p = (char *)1 ;

in which a cast has been used to convert the integer to a character pointer. The
programmer obviously had a strong motivation for doing this, and has clearly
signaled his intentions. It seems harsh for lint to continue to complain about
this. On the other hand, if this code is moved to another machine, such code
should be looked at carefully. The -c option controls the printing of comments
about casts. When -c is in effect, casts are treated as though they were
assignments subject to complaint; otherwise, all legal casts are passed without
comment, no matter how strange the type mixing seems to be.

5.9. Nonportable Character Use

In some implementations, characters are signed quantities, with a range from
-128 to 127. In other C implementations, characters take on only positive
values. Thus, lint will mark certain comparisons and assignments as being
illegal or nonportable. For example, the fragment:

char c;

if( (c = getchar( )) < 0 ) . . .

works on the PDP-11, but will fail on machines where characters always take on
positive values. The real solution is to declare c an integer, since getchar is
actually returning integer values. In any case, lint will say ‘nonportable character
comparison’.
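
A minimal sketch of the recommended fix follows, using getc() on a stream so
that it can be exercised on any file; count_bytes() is a hypothetical helper.
Because c is an int, the end-of-file value stays distinct from every data
byte.

```c
#include <stdio.h>

/*
 * Count the bytes in a stream.  c is declared int, so the EOF value
 * returned by getc() can never be confused with a data byte.
 */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;                      /* int, not char */

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```

With c declared as char, a data byte with all bits set could compare equal to
EOF on a machine with signed characters, and the test could never succeed on a
machine with unsigned ones.
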

A similar issue arises with bitfields; when assignments of constant values are
made to bitfields, the field may be too small to hold the value. This is especially
true because on some machines bitfields are considered as signed quantities.
While it may seem unintuitive to consider that a two-bit field declared of type
int cannot hold the value 3, the problem disappears if the bitfield is declared to
have type unsigned.
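
The unsigned declaration can be illustrated as follows; struct flags and
store_mode() are names made up for the example.

```c
struct flags {
    unsigned int mode : 2;      /* two bits, unsigned: holds 0 through 3 */
};

/* Store a value in the two-bit field and return what the field then holds. */
unsigned int store_mode(unsigned int value)
{
    struct flags f;

    f.mode = value;             /* values above 3 are reduced modulo 4 */
    return f.mode;
}
```

With the field declared unsigned, the value 3 is stored and read back intact.
Had the field been declared int, whether 3 survives would depend on whether
the machine treats the field as signed.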

5.10. Assignments of Longs to Ints

Bugs may arise from the assignment of a long to an int, which may lose
accuracy. This may happen in programs which have been incompletely converted to
use typedefs. When a typedef variable is changed from int to long, the
program can stop working because some intermediate results may be assigned to
ints, losing accuracy. Since there are a number of legitimate reasons for
assigning longs to ints, the detection of these assignments is enabled by the
-a option.

5.11. Strange Constructions

lint flags several perfectly legal, but somewhat strange, constructions; it is
hoped that the messages encourage better code quality, clearer style, and may
even point out bugs. The -h option is used to enable these checks. For example,
in the statement:

*p++ ;

the * does nothing; this provokes the message ‘null effect’ from lint. The
program fragment:

unsigned x ;
if( x < 0 ) . . .

is clearly somewhat strange; the test will never succeed. Similarly, the test:

if( x > 0 ) . . .

is equivalent to:

if( x != 0 )

which may not be the intended action. lint will say ‘degenerate unsigned
comparison’ in these cases. If one says:

if( 1 != 0 ) . . .

lint reports ‘constant in conditional context’, since the comparison of 1 with 0
gives a constant result.

Another construction detected by lint involves operator precedence. Bugs
which arise from misunderstandings about the precedence of operators can be
accentuated by spacing and formatting, making such bugs extremely hard to find.
For example, the statements:

if( x&077 == 0 ) . . .

or

x<<2 + 40

probably do not do what was intended. The best solution is to parenthesize such
expressions, and lint encourages this by an appropriate message.
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Finally, when -h option is in foijce lint complains about variables which 
are redeclared in inner blocks in a way that conflicts with their use in outer 
blocks. This is legal, but is considered by many (including the author) to be bad 
style, usually unnecessary, and frequently a bug; 

5.12. Pointer Alignment

Certain pointer assignments may be reasonable on some machines, and illegal on
others, due entirely to alignment restrictions. For example, on the PDP-11, it is
reasonable to assign integer pointers to double pointers, since double-precision
values may begin on any integer boundary. On the Honeywell 6000,
double-precision values must begin on even word boundaries; thus, not all such
assignments make sense. lint tries to detect cases where pointers are assigned to
other pointers, and such alignment problems might arise. The message ‘possible
pointer alignment problem’ results from this situation whenever either the -p or
the -h option is in effect.

5.13. Multiple Uses and Side Effects

In complicated expressions, the best order in which to evaluate subexpressions
may be highly machine-dependent. For example, on machines (like the PDP-11)
in which the stack runs backwards, function arguments will probably be best
evaluated from right to left; on machines with a stack running forward, left to
right seems most attractive. Function calls embedded as arguments of other
functions may or may not be treated similarly to ordinary arguments. Similar
issues arise with other operators which have side effects, such as the assignment
operators and the increment and decrement operators.

In order that the efficiency of C on a particular machine not be unduly
compromised, the C language leaves the order of evaluation of complicated
expressions up to the local compiler, and, in fact, the various C compilers have
considerable differences in the order in which they will evaluate complicated
expressions. In particular, if any variable is changed by a side effect, and also
used elsewhere in the same expression, the result is explicitly undefined.

lint checks for the important special case where a simple scalar variable is
affected. For example, the statement:

    a[i] = b[i++] ;

will draw the complaint:

    warning: i evaluation order undefined

5.14. Implementation

lint consists of two programs and a driver. The first program is a version of
the Portable C Compiler, which is the basis of many C compilers, including
Sun’s. This compiler does lexical and syntax analysis on the input text,
constructs and maintains symbol tables, and builds trees for expressions. Instead of
writing an intermediate file which is passed to a code generator, as the compilers
do, lint produces an intermediate file which consists of lines of ASCII text.
Each line contains an external variable name, an encoding of the context in
which it was seen (use, definition, declaration, etc.), a type specifier, and a source
file name and line number. The information about variables local to a function or
file is collected by accessing the symbol table, and examining the expression
trees.

Comments about local problems are produced as detected. The information
about external names is collected onto an intermediate file. After all the source
files and library descriptions have been collected, the intermediate file is sorted to
bring all information collected about a given external name together. The
second, rather small, program then reads the lines from the intermediate file and
compares all of the definitions, declarations, and uses for consistency.

The driver controls this process, and is also responsible for making the options 
available to both passes of lint. 

5.15. Portability

Many C programs have been successfully ported to a wide variety of operating
systems, partly as a result of the lint features that increase portability. While
there is no guarantee that a given C program will run unmodified within a
different system environment, passing it through lint identifies and eliminates
many potential portability problems.

For instance, uninitialized external variables are treated differently in different
implementations of C. Suppose two files both contain a declaration without
initialization, such as:

    int a ;

outside of any function. The loader resolves these declarations, and sets aside
only a single word of storage for a. Under the GCOS and IBM implementations,
this is not feasible (for various stupid reasons!) so each such declaration sets
aside a word of storage called a. When loading or library editing takes place, this
creates fatal conflicts which prevent the proper operation of the program. lint
detects such multiple definitions if it is invoked with the -p option.

A related difficulty comes from the amount of information retained about
external names during the loading process. On the SunOS system, externally known
names have seven significant characters, with the upper/lower case distinction
kept. On the IBM systems, there are eight significant characters, but the case
distinction is lost. On GCOS, there are only six characters, of a single case. This
leads to situations where programs run on one system, but encounter loader
problems on others. lint -p maps all external symbols to one case and truncates
them to six characters, providing a worst-case analysis.

A number of differences arise in the area of character handling: characters in
SunOS are eight bit ASCII, while they are eight bit EBCDIC on the IBM, and nine
bit ASCII on GCOS. Moreover, character strings go from high to low bit positions
(‘left to right’) on GCOS and IBM, and low to high (‘right to left’) on the PDP-11.
This means that code attempting to construct strings out of character constants,
or attempting to use characters as indices into arrays, must be looked at with
great suspicion. lint is of little help here, except to flag multi-character
character constants.

Of course, the word sizes are different! This is less troublesome than might be
expected, however. The main problems are likely to arise in shifting or masking.
C now supports a bit-field facility, which can be used to write much of this code
in a reasonably portable way. Frequently, portability of such code can be
enhanced by slight rearrangements in coding style. Many of the
incompatibilities seem to have the flavor of writing:

    x &= 0177700 ;

to clear the low order six bits of x. This suffices on the PDP-11, but fails badly
on GCOS and IBM. If the bit field feature cannot be used, the same effect can be
obtained by writing:

    x &= ~077 ;

which will work on all these machines.

The right shift operator is arithmetic shift on the PDP-11, and logical shift on 
most other machines. To obtain a logical shift on all machines, the left operand 
can be typed unsigned. Characters are considered signed integers on the 
PDP-11, and unsigned on the other machines. This persistence of the sign bit 
may be reasonably considered a bug in the PDP-11 hardware which has infiltrated 
itself into the C language. If there were a good way to discover the programs 
which would be affected, C could be changed; in any case, lint is no help here. 

The above discussion may have made the problem of portability seem bigger
than it in fact is. The issues involved here are rarely subtle or mysterious, at least
to the implementor of the program, although they can involve some work to
straighten out. The most serious bar to the portability of system utilities has been
the inability to mimic essential system functions on the other systems. The
inability to seek to a random character position in a text file, or to establish a pipe
between processes, has involved far more rewriting and debugging than any of
the differences in C compilers. On the other hand, lint has been very helpful
in moving the operating system and associated utility programs to other
machines.

5.16. Shutting lint Up

There are occasions when the programmer is smarter than lint. There may be
valid reasons for ‘illegal’ type casts, functions with a variable number of
arguments, etc. Moreover, as specified above, the flow of control information
produced by lint often has blind spots, causing occasional spurious messages
about perfectly reasonable programs. Thus, some way of communicating with
lint, typically to shut it up, is desirable.

The form which this mechanism should take is not at all clear. New keywords
would require current and old compilers to recognize these keywords, if only to
ignore them. This has both philosophical and practical problems. New
preprocessor syntax suffers from similar problems.

What was finally done was to make lint recognize a number of words when
they were embedded in comments. This required minimal preprocessor changes;
the preprocessor just had to agree to pass comments through to its output, instead
of deleting them as had been previously done. Thus, lint directives are
invisible to the compilers, and the effect on systems with the older preprocessors is
merely that the lint directives don’t work.

The first directive is concerned with flow of control information; if a particular
place in the program cannot be reached, but this is not apparent to lint, this can
be asserted by placing the directive

    /*NOTREACHED*/

just before that spot in the program. The -v option, which suppresses complaints
about unused arguments, can be turned on for one function by the directive:

/*ARGSUSED*/ 

Complaints about variable numbers of arguments in calls to a function can be 
turned off by the directive: 

/*VARARGS*/ 

preceding the function definition. In some cases, it is desirable to check the first 
several arguments, and leave the later arguments unchecked. This can be done 
by following the VARARGS keyword immediately with a digit giving the number 
of arguments which should be checked; thus, 

/*VARARGS2*/ 

checks the first two arguments and leaves the others unchecked. Finally, the 
directive: 

/*LINTLIBRARY*/ 

at the head of a file identifies this file as a library declaration file; this topic is 
worth a section by itself. 

5.17. Library Declaration Files

lint accepts certain library directives, such as:

    -ly

and tests the source files for compatibility with these libraries. This is done by 
accessing library description files whose names are constructed from the library 
directives. These files all begin with the directive: 

/*LINTLIBRARY*/ 

which is followed by a series of dummy function definitions. The critical parts 
of these definitions are the declaration of the function return type, whether the 
dummy function returns a value, and the number and types of arguments to the 
function. The VARARGS and ARGSUSED directives can be used to specify 
features of the library functions. 

lint library files are processed almost exactly like ordinary source files. The
only difference is that functions which are defined in a library file, but not used in
a source file, draw no complaints. lint does not simulate a full library search
algorithm, and complains if the source files contain a redefinition of a library
routine (this is a feature!).

By default, lint checks the routines it is given against a standard library file,
which contains descriptions of the programs which are normally loaded when a C
program is run. When the -p option is in effect, another file is checked
containing descriptions of the standard I/O library routines which are expected to be
portable across various machines. The -n option can be used to suppress all
library checking.
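
A library description file of the kind described above might look roughly like the sketch below. Everything in it is hypothetical: the library, the function names, and the use of ANSI prototypes in place of the older definition style. Only the return types and argument lists carry information; the bodies are dummies.

```c
/*LINTLIBRARY*/

/*
 * Hypothetical description file for an imaginary library. Only the return
 * types and the argument lists matter to lint; the bodies are dummies.
 */

int shape_area(int width, int height)
{
    return 0;
}

/* Only the first argument (the format) is to be checked. */
/*VARARGS1*/
int shape_log(const char *format, ...)
{
    return 0;
}

char *shape_name(int kind)
{
    return (char *)0;
}
```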


5.18. Considerations When Using lint

lint was a difficult program to write, partially because it is closely connected
with matters of programming style, and partially because users usually don’t
notice bugs which cause lint to miss errors which it should have caught. By
contrast, if lint incorrectly complains about something that is correct, the
programmer reports that immediately!

A number of areas remain to be further developed. The checking of structures
and arrays is rather inadequate; size incompatibilities go unchecked, and no
attempt is made to match up structure and union declarations across files. Some
stricter checking of the use of typedef is clearly desirable, but what checking
is appropriate, and how to carry it out, is still to be determined.

lint shares the preprocessor with the C compiler. At some point it may be
appropriate for a special version of the preprocessor to be constructed which
checks for things such as unused macro definitions, macro arguments which have
side effects which are not expanded at all, or are expanded more than once, etc.

The central problem with lint is the packaging of the information which it
collects. There are many options which serve only to turn off, or slightly modify,
certain features. There are pressures to add even more of these options.

In conclusion, it appears that the general notion of having two programs is a good
one. The compiler concentrates on quickly and accurately turning the program
text into bits which can be run; lint concentrates on issues of portability, style,
and efficiency. lint can afford to be wrong, since incorrectness and
over-conservatism are merely annoying, not fatal. The compiler can be fast since it
knows that lint will cover its flanks. Finally, the programmer can concentrate
at one stage of the programming process solely on the algorithms, data structures,
and correctness of the program, and then later retrofit, with the aid of lint, the
desirable properties of universality and portability.


5.19. lint Options

The lint command currently has the form

    tutorial% lint [-abchnpsuvx] filename . . . library-descriptors . . .

The options are

    a   Report assignments of long to int or shorter
    b   Report unreachable break statements
    c   Complain about questionable casts
    h   Perform heuristic checks
    n   Do not do library checking
    p   Perform portability checks
    s   Same as h (for historical reasons)
    u   Don’t report unused or undefined externals
    v   Don’t report unused arguments
    x   Report unused external declarations


Performance Analysis


Tools discussed in this chapter cover facilities for timing programs and getting
performance analysis data. Some tools work only with the C programming
language, while others will work on modules written in any language.
Performance analysis tools provide a variety of levels of analysis, from very simple
timing of a command down to a statement-by-statement analysis of a program. You
can select which level of granularity you like depending on the amount of detail
and optimization you wish to perform. Here are the performance analysis tools
available from the simplest to the most detailed:

time    A simple command (built in to the C shell) to display the time that a
        program takes. The C shell’s built-in time command displays
        statistics about how a command uses the system resources as well as
        the raw time consumed.

prof    Generates a profile for the modules in a program, showing which
        modules are using the time.

gprof   Generates not only a profile as for prof, but also a call
        graph showing which modules call which, and which modules are
        called by other modules. The call graph can sometimes point out
        areas where removing calls can speed up a program.

tcov    Generates a detailed statement-by-statement analysis of a C
        program.

6.1. time — Display Time Used by a Program

Two distinct versions of the time command exist in the Sun system. Here we
discuss the time command that is built in to the C shell. The other time
command is a program (in /bin/time) that you get when you use the Bourne shell.

As a first example, we show the time command being used to display statistics
on the run-time of the index.assist program we’ve used in other examples
in this manual. In all the examples shown here we direct the output from
index.assist into /dev/null. Here is the simplest example of using
time:

    tutorial% time index.assist < index.entries > /dev/null
    13.5u 0.8s 0:15 92% 3+19k 19+1io 0pf+0w

Now to explain the items in the display from the time command above.

The 13.5u means that this program used 13.5 seconds of user time, that is, time
spent in the application program itself. The 0.8s means that the program spent 0.8
seconds in the system; this is time spent in the operating system kernel on
behalf of the program. The third field is the elapsed or wallclock time for the
application. The percentage figure is the user and system time together, expressed
as a percentage of the elapsed time. The rest of the display is of lesser interest
just now and is explained in more detail below.

Effects of Optimizer on Timing

Just for the sake of interest, let’s see what effect the C optimizer has on the run
time of this program. We make the program with the -O option and see what
happens:

    tutorial% time index.assist < index.entries > /dev/null
    13.1u 1.4s 0:38 37% 3+19k 13+0io 1pf+0w

What has happened here? The optimized version takes longer to run! This
demonstration tells us that simple timing is not so simple after all: in a
multitasking system there are many other factors that can affect the simple timing.

Note that the user time for the program is actually slightly less: 0.4 seconds
less. But the system time and the elapsed time are very different. These timings
are affected by the load on the system. If we look at the last field in the time
display, note that in the unoptimized version there were zero page faults, while in
the optimized version there was one page fault. This is an indication that there
was other activity in the system at the time the program was run, and this other
activity will adversely affect the elapsed time. There are two rules you can apply
to this situation:

□ Run such timing tests on a quiet system late at night. Make sure that ‘late at 
night’ is not midnight when a whole bunch of cron daemons start up. 

□ Run timing tests several times and take averages. 

Controlling the display from the time Command

The time command built into the C shell has the capability of altering the
information displayed under control of a shell variable. This is not true of
/bin/time, the command you’d have to use if you were using the Bourne
shell. Here is how to set up the time variable to control the time display.

You can control how the C shell times programs by setting the time variable in
your .login or .cshrc file.

The time variable can be supplied with one or two values, such as
set time=3 or set time=(3 "%E %P%").

Setting the time variable via a set command of the form:

    set time=nnn

means that the shell displays a resource-usage summary for any command
running for more than nnn CPU seconds.





Revision A of 9 May 1988 







Control Key Letters for the time Command

The second form controls exactly what resources are displayed. The character
string can be any string of text with embedded control key-letters in it. A control
key-letter is a percent sign (%) followed by a single upper-case letter. To print a
percent sign, use two percent signs in a row. Unrecognized key-letters are
simply printed. The control key-letters are:


Table 6-1  Control Key Letters for the time Command

    Letter  Description
    D       Average amount of unshared data space used in Kilobytes.
    E       Elapsed (wallclock) time for the command.
    F       Page faults.
    I       Number of block input operations.
    K       Average amount of unshared stack space used in Kilobytes.
    M       Maximum real memory used during execution of the process.
    O       Number of block output operations.
    P       Total CPU time, U (user) plus S (system), as a percentage
            of E (elapsed) time.
    S       Number of seconds of CPU time consumed by the kernel on
            behalf of the user’s process.
    U       Number of seconds of CPU time devoted to the user’s process.
    W       Number of swaps.
    X       Average amount of shared memory used in Kilobytes.



Default Timing Summary

The default resource-usage summary is a line of the form:

    uuu.uu sss.ss ee:ee pp% xxx+dddk iii+oooio mmmpf+www

Table 6-2  Default Timing Summary Chart

    Field        Description
    uuu.u        user time (U),
    sss.s        system time (S),
    ee:ee        elapsed time (E),
    pp           percentage of CPU time versus elapsed time (P),
    xxx          average shared memory in Kilobytes (X),
    ddd          average unshared data space in Kilobytes (D),
    iii and ooo  the number of block input and output operations
                 respectively (I and O),
    mmm          number of page faults (F),
    ww           number of swaps (W).


C shell time Command versus /bin/time

One final note on the time commands. As mentioned previously, there are two
versions of time: the one built in to the C shell as described above, and the
original Bourne shell time command which can be found in /bin/time.

The C shell time command does not time a command which is a component of
a pipeline. This is what happens:

    tutorial% echo timing a pipeline | time cat
    timing a pipeline

whereas the Bourne shell time command gives completely different results:

    tutorial% echo timing a pipeline | /bin/time cat
    timing a pipeline
            0.8 real         0.0 user         0.1 sys



6.2. prof — Generate Profile of a Program

After simple timing, a profile of a program displays a finer level of analysis to
assist in optimizing performance. Getting a profile is the next step after simple
timing; more detailed analysis is provided by the call-graph profile and the
code coverage tools described later in this chapter.


Taking the index.assist program from before as an example, let’s make the
program compiled for profiling. To compile a program for profiling, you use the
-p option to the C compiler:

    tutorial% make CFLAGS=-p
      messages from the make command


Now we can run the index.assist program as before. When a program is profiled,
the results appear in a file called mon.out at the end of the run. Every time you
run the program a new mon.out file is created, overwriting the old version.

You then use the prof command to interpret the results of the profile, as shown
by the example below.

    tutorial% index.assist < index.entries > /dev/null
    tutorial% prof index.assist
     %time  cumsecs  #call  ms/call  name
      19.4     3.28  11962     0.27  _compare_strings
      15.6     5.92  32731     0.08  _strlen
      12.6     8.06   4579     0.47  __doprnt
      10.5     9.84                  mcount
       9.9    11.52   6849     0.25  _get_field
       5.3    12.42    762     1.18  _fgets
       4.7    13.22  19715     0.04  _strcmp
       4.0    13.89   5329     0.13  _malloc
       3.4    14.46  11152     0.05  _insert_index_entry
       3.1    14.99  11152     0.05  _compare_entry
       2.5    15.41   1289     0.33  lmodt
       0.9    15.57    761     0.21  _get_index_terms
       0.9    15.73   3805     0.04  _strcpy
       0.8    15.87   6849     0.02  _skip_space
       0.7    15.99     13     9.23  _read
       0.7    16.11   1289     0.09  ldivt
       0.6    16.21   1405     0.07  _print_index
            everything else is insignificant











This display points out that most of the program’s running time is spent in the 
routine that compares character strings to establish the correct place for the index 
entries, and that after that, the majority of the time is spent in the _strlen 
library routine — to find the length of a character string. If we wish to make any 
appreciable improvements to the program we must concentrate our efforts on the 
compare_strings function. 

Interpreting Profile Display

Let’s interpret the results of the profiling run, though. The results appear under
these column headings:

    %time  cumsecs  #call  ms/call  name

Here’s what the columns mean:

%time     Percentage of the total run time of the program that was consumed
          by this routine.

cumsecs   A running sum of the number of seconds accounted for by this
          function and those listed above it. This information isn’t really
          worth much; the important data comes from the percentage of total
          time and from the time consumed per call.

#call     The number of times this routine was called.

ms/call   How many milliseconds this routine consumed each time it was
          called.

name      The name of the routine.

Now what advice can we derive from the profile data? Notice that the
compare_strings function consumes nearly 20% of the total time. To
improve the run time of index.assist we must either improve the algorithm
that compare_strings uses, or we must cut down the number of calls. Not
obvious from the flat profile is the information that compare_strings is
heavily recursive; we get that fact from using the call graph profile described
below. In this particular case, improving the algorithm also implies reducing the
number of calls.

6.3. gprof — Generate a Call Graph Profile

While the flat profile described in the last section can provide valuable data for
performance improvements, sometimes the data obtained is not sufficient to point
out exactly where the improvements can be made. A more detailed analysis can
be obtained by using the call graph profile that displays a list of which modules
are called by other modules, and which modules call other modules.
Sometimes, removing calls altogether can result in performance improvements.

Compiling with the -pg Option

Using the same index.assist program as an example, let’s make the program
compiled for call-graph profiling. To compile a program for call-graph profiling,
you use the -pg option to the C compiler:

    tutorial% make CFLAGS=-pg
      messages from the make command

Now we can run the index.assist program as before. When a program is
call-graph profiled, the results appear in a file called gmon.out at the end of the
run. You then use the gprof command to interpret the results of the profile:

    tutorial% index.assist < index.entries > /dev/null
    tutorial% gprof index.assist
      voluminous output from the gprof command

Output from gprof

The output from gprof is really voluminous; it’s usually intended that you
take the summaries away and read them later. The output from gprof consists
of the two major items listed below.

□ The flat profile. This is similar to the summary that the prof command
  supplies; gprof gives you slightly more information.

□ The full call-graph profile. There are some fragments of the output from the
  profiling run just below with some examples of how to interpret them.

The output from gprof contains an explanation of what the various parts of the
summary mean, so you don’t need to go look the things up in a manual.


Interpreting Call Graph

Here is a fragment of the output from the gprof summary. Most of the output
has been deleted from before and after the fragment. One thing that gprof does
tell you is the granularity of the sampling:

    granularity: each sample hit covers 4 byte(s) for 0.14% of 14.74 seconds

Then comes part of the call-graph profile itself:

                                      called/total      parents
    index  %time   self descendents   called+self    name    index
                                      called/total      children

                   0.00     14.47         1/1           start [1]
    [2]     98.2   0.00     14.47         1          _main [2]
                   0.59      5.70       760/760         _insert_index_entry [3]
                   0.02      3.16         1/1           _print_index [6]
                   0.20      1.91       761/761         _get_index_terms [11]
                   0.94      0.06       762/762         _fgets [13]
                   0.06      0.62       761/761         _get_page_number [18]
                   0.10      0.46       761/761         _get_page_type [22]
                   0.09      0.23       761/761         _skip_start [24]
                   0.04      0.23       761/761         _get_index_type [26]
                   0.07      0.00       761/820         _insert_page_entry [34]

                                      10392             _insert_index_entry [3]
                   0.59      5.70       760/760         _main [2]
    [3]     42.6   0.59      5.70       760+10392    _insert_index_entry [3]
                   0.53      5.13     11152/11152       _compare_entry [4]
                   0.02      0.01        59/112         _free [38]
                   0.00      0.00        59/820         _insert_page_entry [34]
                                      10392             _insert_index_entry [3]

Noting that there are 761 lines of data in the input file to the index.assist
program, here are some of the things we can determine from the call graph:

□ fgets is called 762 times, one more than the number of lines in the input
  file. The last call to fgets returns an end-of-file.

□ The insert_index_entry function is called 760 times from main, one
  fewer than the number of lines. Why is this? The first index entry
  is inserted ‘manually’ in the main function when there are no previous
  index entries to insert.

□ Note that in addition to the 760 times that insert_index_entry is
  called from main, insert_index_entry also calls itself a grand total
  of 10392 times; insert_index_entry is heavily recursive. Index
  entries appear in the input file in unsorted order and are sorted on the fly by
  inserting them into a binary tree.

□ Note also that compare_entry (which is called from
  insert_index_entry) is called 11152 times, which is equal to
  760+10392 times, so there is one call of compare_entry for every time
  that insert_index_entry is called. This is as it should be. If there
  were a discrepancy in the number of calls, we might suspect some problem in
  the program’s logic.

□ Notice the number of calls to the insert_page_entry and free()
  functions. insert_page_entry is called 820 times in total: 761 times
  from main while the program is building index nodes, and then
  59 times from insert_index_entry. This indicates that there are 59 index
  entries that are duplicated, so their page number entries are linked into a
  chain with the index nodes. The duplicate index entries are then freed,
  hence the 59 calls to free().

6.4. tcov: Statement-Level Analysis

After a certain level of performance enhancements have been made, the profile
data obtained from a program starts to look ‘flat’, and the granularity of the
data collection makes further improvements difficult. At this point, you can
use a tool that performs statement-by-statement analysis on a program, showing
which statements are executed and how many times. This facility is called code
coverage.

Code coverage can also be valuable in identifying areas of ‘dead’ code, that
is, code that never gets executed. It can also point out areas of code that
are not being tested.

Compiling with the -a Option

Using the same index.assist program as an example, let’s compile the program
for code coverage. To do so, you use the -a option to the C compiler, as shown
by the example below.

tutorial% make CFLAGS=-a
messages from the make command

For every thing.c file you compile with the -a option, the C compiler generates
a thing.d file. These files are used later in the analysis by the code coverage
program.

Now we can run the index.assist program as before. After the program has been
run, you can run tcov to get the summaries of execution counts for each
statement in the program:

tutorial% index.assist < index.entries > /dev/null
tutorial% tcov *.c

Now, for every thing.c file you specify, tcov uses the thing.d file and
generates a thing.tcov file containing an annotated listing of your code. The
listing shows the number of times each source statement was executed. At the
end of each thing.tcov file there is a short summary.

Below is a small fragment of the C code from one of the modules of
index.assist. The function in question is insert_index_entry, which is called
so recursively.

              struct index_entry *
              insert_index_entry(node, entry)
              struct index_entry *node;
              struct index_entry *entry;
              {
                  int result;
                  int level;

                  result = compare_entry(node, entry);

                  if (result == 0) {      /* exact match */
                      /* Place the page entry for the duplicate */
                      /* into the list of pages for this node */
   59 ->              insert_page_entry(node, entry->page_entry);
                      free(entry);
                      return(node);
                  }

11093 ->          if (result > 0)         /* node greater than new entry -- */
                                          /* move to lesser nodes */
 3956 ->              if (node->lesser != NULL)
 3626 ->                  insert_index_entry(node->lesser, entry);
                      else {
  330 ->                  node->lesser = entry;
                          return (node->lesser);
                      }
                  else                    /* node less than new entry -- */
                                          /* move to greater nodes */
 7137 ->              if (node->greater != NULL)
 6766 ->                  insert_index_entry(node->greater, entry);
                      else {
  371 ->                  node->greater = entry;
                          return (node->greater);
                      }
              }

Notice that the insert_index_entry function is indeed called 11152 times (the
59 exact matches plus the 11093 calls that go on to test result > 0), as we
determined in the output from gprof. The two recursive calls, 3626 and 6766,
add up to the 10392 self-calls shown in the call graph. The numbers to the side
of the C code show how many times each statement was executed.

tcov Summary

Below is the summary that tcov placed at the end of build.index.tcov.

        Top 10 Blocks

        Line      Count

         240      21563
         241      21563
         245      21563
         251      21563
         250      21400
         244      21299
         255      20612
         257      16805
         123      12021
         124      11962

          77   Basic blocks in this file
          55   Basic blocks executed
       71.43   Percent of the file executed

      439144   Total basic block executions
     5703.17   Average executions per basic block
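
The last few lines of the summary follow from the first ones by simple
arithmetic. The sketch below is not part of tcov; the structure and function
names are hypothetical. It only shows how the percentage and the average
would be derived from the three raw counts.

```c
/* Raw counts as they appear in a tcov summary (names are illustrative). */
struct tcov_summary {
    long blocks_total;       /* basic blocks in this file */
    long blocks_executed;    /* basic blocks executed at least once */
    long executions;         /* total basic block executions */
};

/* Percent of the file executed: executed blocks over all blocks. */
static double percent_executed(const struct tcov_summary *s)
{
    if (s->blocks_total == 0)
        return 0.0;
    return 100.0 * (double)s->blocks_executed / (double)s->blocks_total;
}

/* Average executions per basic block, counting unexecuted blocks too. */
static double average_executions(const struct tcov_summary *s)
{
    if (s->blocks_total == 0)
        return 0.0;
    return (double)s->executions / (double)s->blocks_total;
}
```

For the summary above, 55 of 77 blocks gives 71.43 percent, and 439144
executions spread over 77 blocks gives 5703.17 per block. Note that the
average is taken over all 77 blocks, not only the 55 that were executed.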
The Source Code Control System (SCCS) is a collection of commands that control
changes to selected files, such as the source files for programs and software
projects. SCCS allows you to:

1. Place a file under the control of SCCS. Once a file is under SCCS control, 
copies of any subsequent version can be extracted from a history file. 

2. Check a file out for editing and lock it, so that only you can make changes. 

3. Check in a new version of the file that incorporates your changes. When you 
check a file in, you can also supply comments that summarize the changes 
you’ve made. 

4. Back out your changes if necessary. 

5. Inquire about the status or current version of a file. 

6. Inquire about the line-by-line differences between versions. 

7. Inquire about the version history, including a record of who checked in
which changes, and when they did so.

Collectively, functions such as these are referred to as version control. They are 
important in situations where source files are updated frequently, perhaps by 
more than one person, or where files need to be audited. SCCS allows you to 
recover the current, or any previous version of a file, as needed. It reduces the 
amount of data that must be kept on disk by recording only the differences 
between successive versions. With this information, SCCS can reconstruct the 
initial version, the current version, or any version in between. 


Low-Level SCCS Commands vs. the sccs Command

The Source Code Control System, or SCCS, consists of a set of low-level
commands to perform individual functions, and a high-level front-end command
called sccs.

The sccs command provides a reasonable and consistent user interface to the
various low-level commands. Although they can be used directly, the low-level
commands are more difficult to work with. The remainder of this chapter
describes the high-level sccs command. Refer to Appendix A, SCCS Low-Level
Commands, for information about the SCCS low-level commands.
Conventions

Throughout this chapter, we assume that you are using the C shell on a system
called ‘tutorial’, and so the hostname is shown followed by the % prompt in the
examples. What you type is shown in bold typewriter text, and the system’s
responses are shown in typewriter font, as in this example:

tutorial% sccs create prog.c
prog.c:
87 lines
tutorial%

A record of each version of your source file, along with the version log and
other information, is kept in a history file. This history file is also called
an s.file (“s-dot-file”). The illustration below shows the four basic
version-control operations provided by sccs, and how they affect the history
file.


Figure 7-1 Basic sccs Subcommands

As the picture illustrates, there are four basic sccs subcommands that operate
on the s.file:

□ create the history file.

□ edit, or check out a file for editing. This operation extracts a version of
  the file that is writable only by you, and locks the history file so that no
  one else can get an editable copy.

□ delta (merge) changes that you’ve made back into the s.file. This is the
  complement to the sccs edit operation. Line-by-line differences (see
  diff(1)) are recorded in the history file; the set of differences associated
  with a given version of the file is called a delta. A new version number is
  assigned, and you are prompted for your comments, which are added to the
  header along with other information about the new version.

□ get a read-only copy of the file. This operation extracts, or gets, a version
  of the file from the s.file. By default, a read-only copy of the latest
  version is retrieved. The read-only copy is intended to be static, and can be
  used as a source file for compilation, printing, or whatever; it is
  specifically not intended to be edited or changed in any way. (Attempting to
  bend the rules by changing permissions of an extracted version, and then
  editing it, can result in your changes being lost. If you want to edit a file
  under SCCS control, check it out using sccs edit.)

The s.file is the final authority and archive for whatever SCCS-file you are
working with. The version you get using either sccs get or sccs edit is merely
a copy derived from data in that file; if deleted, the current version can be
gotten once again from the history file. Of course, if you have a file checked
out for editing, you must take care to check in any changes you wish to
incorporate.

Backing Out Pending Changes

Changes that have been made to a checked-out version, but are not yet checked
in, are said to be pending. You can use the sccs unedit command to back out
changes that are still pending. This comes in handy if a version of the file
should become damaged during editing. The unedit subcommand removes the
checked-out version, unlocks the history file, and gets a read-only copy of the
most recent version checked in. In other words, after using unedit, it is as if
you hadn’t checked the file out in the first place.

7.1. Terminology

There are a number of terms worth learning before going any further.

S.File, or History File

The s.file is the history file or archive containing the information needed to
get any desired version of a file; it contains only the original version, and a
record of the differences between versions, rather than the entire text of each
version. This saves disk space, since there is no need to duplicate the lines
that haven’t changed between versions, and it allows selective changes to be
removed later. The s.file also includes some header information for each
version, including comments that a user provides when checking in each version.

SCCS-File

A file under SCCS control is sometimes referred to as an SCCS-file. In some
cases, the history file is also referred to in this way. The context in which
the term is used usually makes clear which is meant.

Deltas

A delta is a set of line-by-line differences associated with a given version of
the file. A delta only includes the specific changes made between two
successive versions. Normally, extracted versions reflect all deltas made
earlier. However, it is possible to get a version that omits selected deltas
associated with specific (earlier) versions.
SIDs, or Version Numbers

An SID, or SCCS-ID, is a number that represents a delta. This is normally a
two-part number, starting with 1.1, composed of a release number and a level
number. The level number is incremented with each new version. Note that the
version number is only associated with the particular set of differences
between two successive versions of a file. It does not represent the cumulative
set of changes from the original. However, since versions are normally
extracted so as to reflect the accumulated changes, the SID of the most recent
delta is often used to represent the version of the file that it corresponds
to.

The release number is normally carried forward between versions; it is possible
to alter the release number using low-level commands.
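
The numbering rule can be written down in a few lines. The sketch below is
not SCCS code; the type and function names are hypothetical, and it handles
only the two-part release.level form described here. It assumes, as this
chapter notes later for sccs edit -rn, that asking for a higher release
number starts that release at level 1.

```c
#include <stdio.h>

/* A two-part SID: release number and level number. */
struct sid {
    int release;
    int level;
};

/* Parse "release.level"; return 0 on success, -1 if text is not two-part. */
static int parse_sid(const char *text, struct sid *out)
{
    int used = 0;

    if (sscanf(text, "%d.%d%n", &out->release, &out->level, &used) != 2)
        return -1;
    if (text[used] != '\0' || out->release < 1 || out->level < 1)
        return -1;
    return 0;
}

/*
 * SID of the next delta. The release is carried forward and the level is
 * incremented, unless new_release asks for a higher release, in which case
 * the first delta of that release gets level 1.
 */
static struct sid next_sid(struct sid current, int new_release)
{
    struct sid next;

    if (new_release > current.release) {
        next.release = new_release;
        next.level = 1;
    } else {
        next.release = current.release;
        next.level = current.level + 1;
    }
    return next;
}
```

Under these assumptions, the delta after 1.2 is 1.3, and the first delta
made with a requested release of 2 is 2.1.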

ID Keywords

SCCS recognizes and expands certain keywords of the form:

%X%

where X is an upper-case letter. These ID keywords can be used to introduce the
current version number, as well as other information, into the read-only
(extracted, but not checked out for editing) versions of the file. For
instance, %I% expands to the SID of the most recent delta checked in. %W%
includes the filename, the SID, and the distinctive string @(#), which is
recognized by certain SCCS commands and makes the expanded SID easy to search
for. The %G% keyword expands to the date of the latest delta. Other ID keywords
are listed in Appendix A, under Identification Keywords.

For example, a line such as:

static char SccsId[] = "%W%\t%G%";

will be replaced with something like:

static char SccsId[] = "@(#)prog.c 1.2 08/29/80";

This tells you the name and version of the source file and the date the delta
was created. The string @(#) is a special string that signals the beginning of
an expanded SCCS-ID keyword.

When you check out a file for editing, the ID keywords are not expanded; they
are only expanded when you get a read-only version. If a version of a file with
the keywords already expanded should happen to be checked in, version-dependent
information is no longer updated automatically, because the unexpanded keywords
have been replaced with text. To alert you to this situation, if an SCCS
command finds no ID keywords in a version being checked in, it gives you the
warning:

No id keywords (cm7)

Note that this does not prevent the file from being checked in or out.

ID keywords can be inserted anywhere in a file, and are typically inserted in
comments. They can also be compiled into an object file, as shown in the
example below.

static char SccsId[] = "%W% %G%";

While this allows the version information to be compiled into the object file,
it also takes up data space when the program is run. If you use this technique
to put ID keywords into header files, it is important to use a different
variable for each header. Otherwise, you will get an error when the compiler
attempts to redefine the variable. However, if a header file is included by
many files that are subsequently loaded together, the version information for
that file may be included in the object file several times; you may find it
more to your taste to place the ID keywords for header files within comments:

/* %W% %G% */

7.2. Creating SCCS History Files with sccs create

To put a set of source files under SCCS control, you must:

□ Make a subdirectory called SCCS, if it isn’t there already (note that SCCS is
  in upper-case, so that it will appear near the top of an ls directory
  listing):

tutorial% mkdir SCCS

□ Use the sccs create command to create the history files for each source
  file. Suppose that you want to have all your .c and .h files under SCCS
  control:

tutorial% sccs create *.[ch]

For each filename argument you supply, the sccs create command:

□ creates a file called s.filename in the SCCS subdirectory,

□ renames each filename by placing a comma in front of the name, so that you
  end up with files of the form ,filename, and

□ gets (extracts) a read-only copy of each filename by using the sccs get
  command.

After verifying that sccs has correctly created the s.files, you can remove the
files whose names start with a comma.

If you want to embed ID keywords in the files, it is best to put them in before
you create the s.files. If you do not, create will print the warning message:

No id keywords (cm7)

You can add the keywords in the same way that you would make any other change
to the file. Check it out for editing using sccs edit, add the keywords, and
check it back in using sccs delta.

7.3. Extracting Current Versions with sccs get

To get a copy of the most recent version of a file, use the command:

sccs get filename...

For example, in response to the command:

tutorial% sccs get prog.c

sccs responds with the version number, and the number of lines extracted:

1.1
87 lines

meaning that a version containing cumulative deltas through 1.1 has been
retrieved, and that it contains 87 lines. The file prog.c is created in the
current directory. Its permissions are set to read-only, which indicates that
no one has it checked out for editing.

This copy of the file should not be changed, since sccs cannot merge the
changes back into the s.file unless the file has been checked out. If you do
manage to force in some changes, those changes may well be lost the next time
someone does an sccs get or sccs edit.

7.4. Changing Files (Creating Deltas)

To change a version of a file, you must obtain a copy of the file that can be
edited. You obtain such a copy using sccs edit, as shown below. Having made the
changes and satisfied yourself that they are correct, you can then merge the
changes back into the sccs history file using sccs delta, also shown below.

Retrieving a File for Editing with sccs edit

To edit a source file, you must first get it, requesting permission to edit it.
(The sccs edit command is equivalent to using the -e option to sccs get.) The
response will be the same as with sccs get, except that it also says that a new
delta is being created:

tutorial% sccs edit prog.c
New delta 1.2

You can then edit it, using a text editor:

tutorial% vi prog.c

Merging Changes Back Into the S.File with sccs delta

When the desired changes have been made, you can put your changes into the
SCCS-file using the delta command:

tutorial% sccs delta prog.c

Delta prompts you for ‘comments?’ before merging the changes in. At this
prompt you should type a one-line description of what the changes mean (more
lines can be entered by ending each line except the last with a backslash).
Delta then types:

1.2
5 inserted
3 deleted
84 unchanged

saying that delta 1.2 was created, and that it inserted five lines, removed
three lines, and left 84 lines unchanged. (Changes to a line are counted as a
line deleted and a new line inserted.) The prog.c file is then removed; it can
be retrieved using sccs get.

If you give several filename arguments to delta, they will all be checked in
with the same comment.

Version Control for Binary Files

Although sccs is typically used for source files that contain ASCII text, this
version of sccs allows you to apply version control to binary files (files that
contain NULL or control characters, or do not end with a NEWLINE). The binary
files are encoded (see uuencode(1C) for details) into an ASCII representation
as they are checked in, and versions are decoded as they are extracted.

Version control functions can be useful for data files, such as icons, raster
images, or screen font tables, that you may wish to edit and track. For
instance, you might want to track changes to a screen font, which can be
created and maintained using fontedit(1). When you create or delta a binary
file such as this, you get the warning message:

Not a text file (ad31)

You may also get the:

No id keywords (cm7)

message. Otherwise, everything proceeds normally, as shown by the example
below.

tutorial% sccs create special.font
special.font:
Not a text file (ad31)
No id keywords (cm7)
No id keywords (cm7)
tutorial% sccs get special.font
tutorial% file special.font SCCS/s.special.font
special.font:            vfont definition
SCCS/s.special.font:     sccs

You can use sccs create -b to force sccs to treat a file as a binary file as
opposed to a text file.

Since binary files (and their encoded representation) can vary significantly
between versions, their history files tend to grow at a much faster rate than
text file histories. In fact, when it comes to archiving object files and
executables, it can take less disk space simply to store each version of the
file as is. Fortunately, those files are not normally edited, so they don’t
really require version control. Using sccs to control the source files from
which an object file is built, and using make to build it in a consistent
manner, is a more practical method for maintaining object files and executable
programs.

When Making Deltas

A little forethought helps when deciding whether to check in a file. Making a
new delta after every single edit during the debugging phase, for instance, can
get to be excessive. On the other hand, leaving a file checked out for so long
that you forget about it can be very inconvenient for someone else who may need
to edit it later.

So long as you are certain that you are the only one who requires access to the
file, it makes sense to complete a set of related changes before checking the
file back in.

When you provide comments for a delta, it is important to make them meaningful.
You may have to return to the file several months later, at which time a useful
summary of what you’ve done in each delta will be a big help. Numerous marginal
deltas with meaningless comments such as “fixed compilation problem in previous
delta,” or “fixed botch in 1.3,” are seldom helpful or welcome.

It is very important to check in all changes before compiling or installing a
module for general use. A good technique is to edit the files you need, make
all necessary changes and tests, compile and debug the files repeatedly until
you are satisfied, and make a delta. After making the delta, it is a good idea
to get the files, and then recompile and/or install the finished versions.
Finding Out What’s Going On with sccs info

To find out what files are being edited, type:

tutorial% sccs info

to display a list of all the files being edited and other information, such as
the name of the user who did the edit. Also, the command:

tutorial% sccs check

is nearly equivalent to the info command, except that it is silent if nothing
is being edited, and returns a non-zero exit status if anything is being
edited. It can thus be used in an ‘install’ entry in a makefile to abort the
installation if anything has not been properly delta’ed.

If you know that everything being edited should be delta’ed, you can use:

tutorial% sccs delta `sccs tell`

The tell command is similar to info except that only the names of files being
edited are output, one per line.

All of these commands take a -b option to ignore ‘branches’ (alternate
versions, described later) and the -u option to give only files being edited by
you. The -u option takes an optional user argument, giving only files being
edited by that user. For example:

tutorial% sccs info -ujohn

gives a listing of files being edited by user john.

Finding Out What Versions Are Being Used with sccs what

To find out what version of a program is being run, use:

sccs what prog.c /usr/bin/prog

which will print all strings it finds that begin with ‘@(#)’. This works on all
types of files, including binaries and libraries, provided that the ID keywords
have been compiled in. For example, the above command will output something
like:

tutorial% sccs what prog.c /usr/bin/prog
prog.c:
        prog.c  1.2  08/29/80
/usr/bin/prog:
        prog.c  1.1  02/05/79

From this one can see that the source in prog.c will not compile into the same
version as the binary in /usr/bin/prog.
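
The search that what performs is simple enough to sketch. The function below
is not the what source; its name and interface are hypothetical. It scans a
buffer (text or binary) for the @(#) marker and copies what follows, stopping
at a double quote, a greater-than sign, a newline, a backslash, or a null
character, which is how the terminating characters are conventionally
documented for what(1).

```c
#include <stddef.h>
#include <string.h>

/*
 * Find the next "@(#)" marker in buf[0..len-1], starting at *pos.
 * On success, copy the text that follows the marker into out
 * (NUL-terminated, truncated to outsz-1 bytes), leave *pos at the
 * character that ended the string, and return 1.
 * Return 0 when no further marker is found.
 */
static int next_ident(const char *buf, size_t len, size_t *pos,
                      char *out, size_t outsz)
{
    static const char marker[] = "@(#)";
    const size_t mlen = sizeof marker - 1;
    size_t i, n;

    if (outsz == 0)
        return 0;
    for (i = *pos; i + mlen <= len; i++) {
        if (memcmp(buf + i, marker, mlen) != 0)
            continue;
        i += mlen;
        n = 0;
        /* Copy until a terminator or the end of the buffer. */
        while (i < len && buf[i] != '\0' && buf[i] != '\n' &&
               buf[i] != '"' && buf[i] != '>' && buf[i] != '\\') {
            if (n + 1 < outsz)
                out[n++] = buf[i];
            i++;
        }
        out[n] = '\0';
        *pos = i;
        return 1;
    }
    *pos = len;
    return 0;
}
```

Calling next_ident repeatedly with the same pos variable walks through every
identification string in the buffer, which is why one binary built from
several source files can report several versions.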



Revision A of 9 May 1988 





































106 Programming Utilities and Libraries 


Keeping SIDs Consistent Across Files

With some care, it is possible to keep the SIDs consistent in multi-file
systems. The trick here is to always sccs edit all files at once. The changes
can then be made to whatever files are necessary, and then all files (even
those not changed) are delta’ed. This can be done fairly easily by just
specifying the SCCS subdirectory as the filename argument to both edit and
delta.

With the delta subcommand, you are prompted for comments only once; the comment
is applied to all files being checked in.

Creating New Releases

To create a new release of a program, specify the release number you want to
create when you check the file out for editing, using the -rn option to edit,
where n is the new release number. With -r2, for instance, when the new version
is delta’ed, it will be the first level delta in release 2, with SID 2.1. To
change the release number for all SCCS-files in the directory, use:

tutorial% sccs edit -r2 SCCS

sccs allows you to extract any previously checked-in version of a file. This
can come in handy if you need to backtrack to an earlier version. In this case,
you can check out the current version, extract a writable copy of an earlier
“good” version (under a different name) using a command of the form:

sccs get -k -rSID -Gnewname filename

and then move the old version to the given filename, and check the file back
in.

Getting a Delta by Date

In some cases you don’t know the SID of the delta you want, but you do know the
date on (or before) which it was checked in. You can extract the version of the
file that was the last one checked in before the given date using the -c
(cutoff) option. For example,

tutorial% sccs get -c800722120000 prog.c

retrieves whatever version was current as of July 22, 1980 at 12:00 noon.
Trailing components can be stripped off (defaulting to their highest legal
value), and punctuation can be inserted in the obvious places; for example, the
above line could be equivalently stated as:

tutorial% sccs get -c"80/07/22 12:00:00" prog.c
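
The two rules for cutoff dates (punctuation is ignored, and missing trailing
components default to their highest legal value) can be sketched as follows.
This is not the SCCS parser; the names are hypothetical, two-digit years are
assumed to mean 19YY, and each component is assumed to be written with
exactly two digits.

```c
#include <ctype.h>

/* A fully specified cutoff time (illustrative type). */
struct cutoff {
    int year, month, day, hour, minute, second;
};

static int days_in_month(int year, int month)
{
    static const int days[] = { 31, 28, 31, 30, 31, 30,
                                31, 31, 30, 31, 30, 31 };
    int leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;

    return (month == 2 && leap) ? 29 : days[month - 1];
}

/*
 * Parse YY[MM[DD[HH[MM[SS]]]]], skipping any non-digit punctuation.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_cutoff(const char *text, struct cutoff *out)
{
    int field[6];
    int count = 0;
    const char *p = text;

    while (*p != '\0') {
        if (!isdigit((unsigned char)*p)) {      /* punctuation: skip it */
            p++;
            continue;
        }
        if (count == 6 || !isdigit((unsigned char)p[1]))
            return -1;
        field[count++] = (p[0] - '0') * 10 + (p[1] - '0');
        p += 2;
    }
    if (count == 0)
        return -1;

    out->year = 1900 + field[0];    /* assumption: two-digit years are 19YY */
    out->month = count > 1 ? field[1] : 12;
    if (out->month < 1 || out->month > 12)
        return -1;
    /* Omitted components take their highest legal value. */
    out->day = count > 2 ? field[2] : days_in_month(out->year, out->month);
    out->hour = count > 3 ? field[3] : 23;
    out->minute = count > 4 ? field[4] : 59;
    out->second = count > 5 ? field[5] : 59;

    if (out->day < 1 || out->day > days_in_month(out->year, out->month) ||
        out->hour > 23 || out->minute > 59 || out->second > 59)
        return -1;
    return 0;
}
```

Under these assumptions, both forms of the example above parse to the same
moment, and a cutoff of 8002 means the last second of February 29, 1980.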

Selectively Deleting Old Deltas

Suppose that you later decided that you liked the changes in delta 1.4, but
that delta 1.3 should be removed. You could do this by excluding delta 1.3:

tutorial% sccs edit -x1.3 prog.c

When delta 1.5 is made, it will include the changes made in delta 1.4, but will
exclude the changes made in delta 1.3. You can exclude a range of deltas using
a dash. For example, if you want to get rid of 1.3 and 1.4 you can use:

tutorial% sccs edit -x1.3-1.4 prog.c

which will exclude all deltas from 1.3 through 1.4. Alternatively,

tutorial% sccs edit -x1.3-1 prog.c

will exclude a range of deltas from 1.3 to the current highest delta in
release 1.

In certain cases when using -x (or -i; see below) there will be conflicts
between versions; for example, it may be necessary to both include and delete a
particular line. If this happens, sccs always displays a message telling the
range of lines affected; these lines should then be examined very carefully to
see if the version sccs got is correct.

Since each delta (in the sense of ‘a set of changes’) can be excluded at will,
it is most useful to put each semantically distinct change into its own delta.

7.6. Auditing Changes

Displaying Delta Comments with sccs prt

When you created a delta, you presumably gave a reason for the delta at the
‘comments?’ prompt. To display these comments later, use:

tutorial% sccs prt prog.c

which produces a report for each delta of the SID, time and date of creation,
user who created the delta, number of lines inserted, deleted, and unchanged,
and the comments associated with the delta. For example, the output of the
above command might be:

tutorial% sccs prt prog.c
D 1.2   80/08/29 12:35:31   bill   2 1   00005/00003/00084
removed "-q" option

D 1.1   79/02/05 00:19:31   eric   1 0   00087/00000/00000
date and time created 80/06/10 00:19:31 by eric

Finding Why Lines Were Inserted

To find out why you inserted lines, you can get a copy of the file with each
line preceded by the SID that created it:

tutorial% sccs get -m prog.c

You can then find out what changes were made by that delta by printing the
comments using prt.

To find out what lines are associated with a particular delta, 1.3 for
instance, use:

tutorial% sccs get -m -p prog.c | grep '^1.3'

The -p option makes sccs send the generated source to the standard output
rather than to a file.


Discovering What Changes You Have Made with sccs diffs

When you are editing a file, you can find out what changes you have made using:

tutorial% sccs diffs prog.c

Most of the options to diff can be used. To pass the -c option to diff,
however, use -C. You can also use the -r and -c options to compare the version
being edited with an earlier checked-in version.

To compare two checked-in versions, use:

tutorial% sccs sccsdiff -r1.3 -r1.6 prog.c

to see the differences between delta 1.3 and delta 1.6. Again, most options to
diff can be used, as can the -c option of sccs; for the -c diff option, use -C.


7.7. Shorthand Notations

There are several sequences of commands that are used frequently; sccs tries to
make it easy to do these.

Making a Delta and Getting a File with sccs delget

A frequent requirement is to make a delta of some file and then get that file.
This is done by using:

tutorial% sccs delget prog.c

which is entirely equivalent to:

tutorial% sccs delta prog.c
tutorial% sccs get prog.c

except that if an error occurs while making a delta of any of the files, none
of them will be gotten. The sccs deledit command is equivalent to sccs delget
except that the sccs edit command is used instead of the sccs get command; this
is useful for checking in a set of changes while you continue editing.

Replacing a Delta with sccs fix

Frequently, there are small bugs in deltas, for instance, compilation errors,
for which there is no reason to maintain an audit trail. To replace a delta,
use:

tutorial% sccs fix -r1.4 prog.c

This gets a copy of delta 1.4 of prog.c for you to edit and then deletes delta
1.4 from the SCCS-file. When you do a delta of prog.c, it will be delta 1.4
again. The -r option must be specified, and the delta that is specified must be
a leaf delta, that is, no other deltas may have been made subsequent to the
creation of that delta.

Backing Out of an Edit with sccs unedit

If you found you edited a file that you did not want to edit, you can back out
by using:

tutorial% sccs unedit prog.c

Working From Other Directories

If you are working on a project where the history files are in another
directory, you may be able to simplify things by making a symbolic link to the
true SCCS subdirectory:

tutorial% ln -s /usr/src/cmd/SCCS SCCS

With this method, you can get a separate set of source files in a location that
is more convenient. While in the working directory, you can also check files in
and out, just as you could if you were in the original directory from which the
history files were created.

To extract a complete set of duplicate sources, use the command sccs get SCCS.

7.8. Using sccs on a Project

Working on a project with several people has its own set of special problems.
The main problem occurs when two people attempt to modify a file at the same
time. sccs prevents this by locking an s.file while it is being edited.

As a result, you should not check files out unless you are actually making
changes to them, since this prevents other people from making needed changes.
For example, a good scenario for working might be:

tutorial% sccs edit a.c g.c t.c
tutorial% vi a.c g.c t.c
# do testing of the (experimental) version
tutorial% sccs delget a.c g.c t.c
tutorial% sccs info
# should respond "Nothing being edited"
tutorial% make install

As a general rule, all source files should be checked in before installing the
program for general use. This will ensure that it is possible to restore any
version in use at any time.
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7.9. Saving Yourself 

Recovering a Corrupted Edit 
File 

Sometimes you may find that you have destroyed or trashed a file that you were 
trying to edit.¹ Unfortunately, you can’t just remove it and re-sccs edit it; 
sccs keeps track of the fact that someone is trying to edit it, so it won’t let you 
do it again. Simply using sccs get would expand the ID keywords, and 
besides, if there are portions of the edited file that you want to preserve, you 
don’t want to overwrite it. Instead, you can get a writable copy of the file (with 
unexpanded keywords) using the -k and -Gfilename options in combination. 

The -k option tells sccs to get a writable version. The -Gfilename option tells 
it to place the copy in the named file: 

tutorial% sccs get -k -G/tmp/prog.c prog.c 
tutorial% ls -l /tmp/prog.c prog.c 
-rw-r--r-- 1 user 42652 May 20 17:21 /tmp/prog.c 
-rw-r--r-- 1 user 43654 May 19 17:40 prog.c 

From here, you can use diff and your favorite editor to selectively restore the 
changes you wish to keep. Of course, if you just want to start over, you can 
simply sccs unedit, and then sccs edit the file once again. 

¹ Or given up and decided to start over. 

Restoring the History File 

In particularly bad circumstances, the history file itself may get corrupted. The 
most common way this happens is for someone to edit it. Since the file contains 
a checksum, you will get errors every time you read a corrupted file. To correct 
the checksum, use: 

tutorial% sccs admin -z prog.c 

CAUTION 

When sccs says that the history file is corrupted, it may indicate serious 
damage beyond an incorrect checksum. Be careful to safeguard your 
current changes before attempting to correct a history file. 

7.10. Managing SCCS-Files 
with sccs admin 

There are a number of parameters that can be set using the admin command. The 
most interesting of these are flags. Flags can be added by using the -f option. 
For example: 

tutorial% sccs admin -fd1 prog.c 

sets the ‘d’ flag to the value ‘1’. This flag can be deleted by using: 

tutorial% sccs admin -dd prog.c 

The most useful flags are: 

b Allow branches to be made using the -b option to sccs edit. 

dSID 

Default SID to be used on an sccs get or sccs edit. If this is just a 
release number it constrains the version to a particular release only. 

i Give a fatal error if there are no ID keywords in a file. This is useful to 
guarantee that a version of the file does not get merged into the s.file that has 
the ID keywords inserted as constants instead of internal forms. 

t The ‘type’ of the module. Actually, the value of this flag is unused by 
sccs, except that it replaces the %Y% keyword. 

The admin command also accepts the following option: 

-tfile 

Store descriptive text from file in the SCCS-file. This descriptive text might 
be the documentation or a design and implementation document. Using the 
-t option ensures that if the SCCS-file is passed on to someone else, the 
documentation will go along with it. If file is omitted, the descriptive text is 
deleted. To see the descriptive text, use prt -t. 

The admin command can be used safely any number of times on files. A file 
need not be gotten for admin to work. 

7.11. Maintaining Different 
Versions (Branches) 

Sometimes it is convenient to maintain an experimental version of a program for 
an extended period while normal maintenance continues on the version in 
production. This can be done using a ‘branch’. Normally deltas continue in a 
straight line, each depending on the delta before. Creating a branch ‘forks off’ a 
version of the program. 

The ability to create branches must be enabled in advance using: 

tutorial% sccs admin -fb prog.c 

The -fb option can be specified when the SCCS-file is first created. 

Creating a Branch 

To create a branch, use: 

tutorial% sccs edit -b prog.c 

This will create a branch with (for example) SID 1.5.1.1. The deltas for this 
version will be numbered 1.5.1.n. 

Getting From a Branch 

Deltas in a branch are normally not included when you do a get. To get these 
versions, you will have to say: 

tutorial% sccs get -r1.5.1 prog.c 

Merging a Branch Back into 
the Main Trunk 

At some point you will have finished the experiment, and if it was successful you 
will want to incorporate it into the released version. But in the meantime 
someone may have created a delta 1.6 that you don’t want to lose. The commands: 

tutorial% sccs edit -i1.5.1.1-1.5.1 prog.c 
tutorial% sccs delta prog.c 

will merge all of your changes into the release system. If some of the changes 
conflict, delta will print an error. The generated result should be carefully 
examined before the delta is made. 


A More Detailed Example 

The following technique might be used to maintain a different version of a 
program. First, create a directory to contain the new version: 

tutorial% mkdir ../newxyz 
tutorial% cd ../newxyz 

Edit a copy of the program on a branch: 

tutorial% sccs -d../xyz edit -b prog.c 

When using the old version, be sure to use the -b option to info, check, tell, and 
clean to avoid confusion. For example, use: 

tutorial% sccs info -b 

when in the ‘xyz’ directory. 

If you want to save a copy of the program (still on the branch) back in the s.file, 
you can use: 

tutorial% sccs -d../xyz deledit prog.c 

which will do a delta on the branch and reedit it for you. 

When the experiment is complete, merge it back into the s.file using delta: 

tutorial% sccs -d../xyz delta prog.c 

At this point you must decide whether this version should be merged back into 
the trunk, that is, the default version, which may have undergone changes. If so, 
it can be merged using the -i option to sccs edit as described above. 

A Warning 

Branches should be kept to a minimum. After the first branch from the trunk, 
SIDs are assigned rather haphazardly, and the structure gets complex fast. 
7.12. sccs Quick 
Reference 

Commands 

This list is not exhaustive; for more options see Appendix A of this manual. 

sccs get 
    Gets files for compilation (not for editing). ID keywords are expanded. 
    -rSID       Version to get. 
    -p          Send to standard output rather than to the actual file. 
    -k          Don’t expand ID keywords. 
    -Gfilename  Get to a named file. 
    -ilist      List of deltas to include. 
    -xlist      List of deltas to exclude. 
    -m          Precede each line with SID of creating delta. 
    -cdate      Don’t apply any deltas created after date. 

sccs edit 
    Gets files for editing. ID keywords are not expanded. Should be matched 
    with a delta command. 
    -rSID       Same as for sccs get. If SID specifies a release that does not 
                yet exist, the highest numbered delta is retrieved and the new 
                delta is numbered with SID. 
    -b          Create a branch. 
    -ilist      Same as for sccs get. 
    -xlist      Same as for sccs get. 

sccs delta 
    Merge a file gotten using edit back into the s.file. Collect comments 
    about why this delta was made. 

sccs unedit 
    Remove a file that has been edited previously without merging the changes 
    into the s.file. 

sccs prt 
    Produce a report of changes. 
    -t          Print the descriptive text. 
    -e          Print (nearly) everything. 

sccs info 
    Give a list of all files being edited. 
    -b          Ignore branches. 
    -u[user]    Ignore files not being edited by user. 

sccs check 
    Same as info, except that nothing is printed if nothing is being edited 
    and exit status is returned. 

sccs tell 
    Same as info, except that one line is produced per file being edited 
    containing only the file name. 

sccs clean 
    Remove all files that can be regenerated from the s.file. 

sccs what 
    Find and print ID keywords. 

sccs admin 
    Create or set parameters on s.files. 
    -ifile      Create, using file as the initial contents. 
    -z          Rebuild the checksum in case the file has been trashed. 
    -fflag      Turn on flag. 
    -dflag      Turn off (delete) flag. 
    -tfile      Replace the descriptive text in the s.file with the contents of 
                file. If file is omitted, the text is deleted. Useful for 
                storing documentation or design and implementation documents to 
                ensure they get distributed with the s.file. 

    Useful flags that can be introduced via the -f and -d options are: 
    b           Allow branches to be made using the -b option to edit. 
    dSID        Default SID to be used on a get or edit. 
    i           Make the ‘No Id Keywords’ error message a fatal error rather 
                than a warning. 
    t           The module ‘type’; the value of this flag replaces the %Y% 
                keyword. 

sccs fix 
    Remove a delta and reedit it. 

sccs delget 
    Do a delta followed by a get. 

sccs deledit 
    Do a delta followed by an edit. 

sccs create 
    Create a history file. Move the original file to a backup file with a 
    comma prefix. 
    -b          Force the file to be treated as a binary file. 

sccs diffs 
    Show line-by-line differences between the edited version and a checked-in 
    version (the most recent by default). 
    -rSID       Specify a version to compare against. 

sccs sccsdiff 
    Show line-by-line differences between two checked-in versions. 
    -rSID       Specify a version to compare against. (You must specify two 
                versions to compare.) 
ID Keywords 


%Z%    Expands to ‘@(#)’ for the what command to find. 

%M%    The current module name, for example, prog.c. 

%I%    The highest SID applied. 

%W%    A shorthand for %Z%%M% <tab> %I%. 

%G%    The date of the delta corresponding to the %I% keyword. 

%R%    The current release number, that is, the first component of the %I% 
       keyword. 

%Y%    Replaced by the value of the t flag (set by admin). 




Revision A of 9 May 1988 




make User’s Guide 

8.1. Overview 
    Consistency Control 
    Dependency Checking: make vs. Shell Scripts 
    make Basics 
    Basic Use of Implicit Rules 
    Writing a Simple Makefile 
    Processing Dependencies 
    Missing Targets and Dependencies 
    Running Commands Silently 
    Ignoring a Command’s Exit Status 
    Automatic Extraction of SCCS Files 
    Suppressing SCCS Extraction 
    Passing Parameters: Simple make Macros 
    Command Dependency Checking and .KEEP_STATE 
    Suppressing or Forcing Command Dependency Checking for Selected Lines 
    The State File 
    Hidden Dependencies and .KEEP_STATE 
    Displaying Information About a make Run 

8.2. Compiling Programs with make 
    Compilation Strategies 
    A Simple Makefile 
    Using make’s Predefined Macros 
    Using Implicit Rules to Simplify a Makefile: Suffix Rules 
    When to Use Explicit Target Entries vs. Implicit Rules 
    Implicit Rules and Dynamic Macros 
    Dynamic Macro Modifiers 
    Dynamic Macros and the Dependency List: Delayed Macro References 
    How make Evaluates Dependencies 
    Adding Suffix Rules 
    Pattern Matching Rules: an Alternative to Suffix Rules 
    make’s Default Suffix Rules and Predefined Macros 

8.3. Building Object Libraries 
    Libraries, Members and Symbols 
    Library Members and Dependency Checking 
    Library Member Name-Length Limit 
    .PRECIOUS: Preserving Libraries Against Removal Due to Interrupts 

8.4. Maintaining Programs and Libraries With make 
    Using Macros for Added Flexibility 
    Embedded Macro References 
    A More Flexible Makefile 
    Makefiles as Specifications 
    Suffix Replacement in Macro References 
    Using lint with make 
    Linking With System-Supplied Libraries 
    Compiling Programs for Debugging and Profiling 
    Conditional Macro Definitions 
    Compiling Debugging and Profiling Variants 
    Maintaining Separate Program and Library Variants 
    Pattern Replacement Macro References 
    Makefile for a Program with Separate Variants 
    Makefile for a Library with Separate Variants 
    Maintaining a Directory of Header Files 
    Compiling and Linking With Your Own Libraries 
    Nested make Commands 
    Forcing A Nested make Command to Run 
    The MAKEFLAGS Macro 
    Macro Definitions and Environment Variables: Passing Parameters to 
        Nested make Commands 
    Compiling Other Source Files 
    Compiling and Linking a C Program with Assembly Language Routines 
    Compiling lex and yacc Sources 
    Specifying Target Groups With the + Sign 
    Maintaining Shell Scripts with make and SCCS 
    Running Tests with make 
    Delayed References to a Shell Variable 

8.5. Maintaining Software Projects 
    Organizing A Project for Ease of Maintenance 
    Using include Makefiles 
    Installing Finished Programs and Libraries 
    Building the Entire Project 
    Maintaining Directory Hierarchies With Recursive Makefiles 
    Recursive install Targets 
    Maintaining A Large Library as a Hierarchy of Subsidiaries 
    Closing Remarks about make 

make User’s Guide 


8.1. Overview 

make streamlines the process of generating and maintaining object files and 
executable programs. It helps you to compile programs consistently, and eliminates 
unnecessary compilation of modules that are unaffected by source code changes. 

make provides a number of features that simplify compilations, but you can also 
use it to automate any complicated or repetitive task that isn’t interactive. For 
instance, you can use make to update and maintain object libraries, run test 
suites, and install validated files onto a filesystem or tape. In conjunction with 
SCCS, you can use make to ensure that all programs are built from the most 
recent source versions. You can also use make and SCCS to build an entire 
software project, and to maintain the source files and directories from which that 
project is built. 

This chapter describes Sun’s enhanced version of make, which includes new 
features such as hidden dependency checking, command dependency checking, 
pattern-matching implicit rules, and automatic extraction of SCCS files. It is 
highly compatible with makefiles written for previous versions. Makefiles that 
rely on Sun’s enhancements may not be compatible with other versions of make. 
Refer to Appendix A for a complete summary of enhancements. 

Consistency Control 

The Source Code Control System, or SCCS, provides facilities for version control 
over source files. These include file locking, audit trails, commentary, and other 
useful features. Refer to Chapter 7 of this manual for an introduction to SCCS. 

make provides facilities for consistency control over the object files or other files 
derived from those sources. It rebuilds the files, in a modular and consistent 
fashion, when the source files they derive from have changed. 

make reads a file that you create, called a makefile, which contains information 
about what files to build and how to build them. Once you write and test the 
makefile, you can forget about the processing details; make takes care of them. 
This gives you more time to concentrate on debugging and correcting your code; 
the repetitive portion of the maintenance cycle is reduced to: 

think — edit — make — test... 

Dependency Checking: make 
vs. Shell Scripts 

While it is possible to use a shell script to assure consistency in trivial cases, 
scripts are often inadequate in actual practice. On the one hand, you don’t want 
to wait for a simple-minded script to compile every single program or object 
module when only one of them has changed. On the other hand, having to edit 
the script for each iteration can defeat the objective of consistent compilation. 

Although it is possible to write a script of sufficient complexity to process only 
those modules that require it, such scripts can often develop maintenance 
problems of their own. In any case, make eliminates the need for you to do so. 

make allows you to write a simple, structured listing of what to build and how to 
build it. It uses the mechanism of dependency checking to compare each module 
with the source files or intermediate files it derives from. make only rebuilds a 
module if one or more of these prerequisite files, called dependency files, has 
changed since the module was last built. To determine whether a derived file is 
out of date with respect to its sources, make compares the modification time of 
the module with that of the source file. If the module is missing, or if it is older 
than the source file, it is considered to be out of date; make issues the commands 
necessary to rebuild it. Optionally, a target can be treated as out of date if the 
commands used to build it have changed. 
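
The time comparison can be sketched in the shell. The function below treats a target as out of date when it is missing or older than its source, using the -newer test of find. It illustrates the rule only: out_of_date is a hypothetical name, this is not how make itself is implemented, and the sketch ignores command dependency checking.

```shell
# out_of_date TARGET SOURCE
# Succeeds (exit status 0) when TARGET is missing or older than SOURCE.
out_of_date() {
    # A missing target is always out of date.
    [ -e "$1" ] || return 0
    # find prints SOURCE only if it was modified more recently than TARGET.
    [ -n "$(find "$2" -prune -newer "$1")" ]
}
```

A build step could then be guarded with, for example, out_of_date go.o go.c && cc -c go.c -o go.o.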

Because make does a complete dependency scan, changes to a source file are 
consistently propagated through any number of intermediate files or processing 
steps. This lets you specify a hierarchy of processing steps in a top-down 
fashion. 

make Basics 

You can think of a makefile as a type of recipe. make reads the recipe, decides 
which steps need to be performed, and executes only those steps that are required 
to produce the finished product. Each file to build, or step to perform, is called a 
target. The makefile entry for a target contains its name, and a list of commands 
for building it called a rule, along with a list of dependencies. make treats 
dependencies as prerequisite targets, and updates them if necessary, before 
processing the target that depends on them. 

The file for which the target is named is also referred to as a target file. Each file 
from which a target is derived (or that the target depends on) is called a 
dependency file with respect to that target. 

make assumes that only it will make changes to files being processed during 
the current make run. If a source file changes in the middle of the run, the 
files make produces may be in an inconsistent state. 

Basic Use of Implicit Rules 

In addition to any makefile(s) that you supply, make reads in the default 
makefile, /usr/include/make/default.mk, which contains target 
entries for implicit rules, as well as other information.¹ When there is no target 
entry in the makefile for a specified target, make attempts to select an implicit 
rule for building it. When it finds a rule for the target’s class, it applies the 
commands listed in the implicit rule’s target entry to build the specific target. 

There are two types of implicit rules. “Suffix” rules specify a set of commands 
for building a file with one suffix from another file with the same basename but a 
different suffix. “Pattern-matching” rules select a rule based on a target and 
dependency pair matching a certain wild-card pattern. The default set of implicit 
rules provided by make are of the former type, namely, suffix rules. 

¹ Implicit rules were hard-coded in earlier versions of make. 

In some cases, the use of suffix rules can eliminate the need for writing a 
makefile entirely. For instance, to build an object file named go.o from a single 
C source file named go.c, you could use the command: 

make go.o 

as shown: 

tutorial% make go.o 
cc -sun4 -c go.c -o go.o 

This would work equally well for building the object file nonesuch.o from the 
source file nonesuch.c. 

To build an executable file named go (with a null suffix) from go.c, you need 
only type the command: 

make go 

as shown: 

tutorial% make go 
cc -sun4 -o go go.c 

The rule for building a .o file from a .c file is called the .c.o (pronounced 
"dot-c-dot-o") suffix rule. The rule for building an executable program from a 
.c file is called the .c (dot-c) rule. The complete set of default suffix rules is 
listed in Table 3-1. 


Writing a Simple Makefile 

The basic format for a makefile target entry is: 

Figure 8-1 Makefile Target Entry Format 

target ... : [ dependency ... ] 
	[ command ] 

If there is no rule for a target entry, make looks for an implicit rule to 
use. 

If the dependency list is terminated with a semicolon and followed by a 
command, that command is included in the rule. However, makefiles tend to 
read better if you avoid this. 

Command lines in a rule start with a TAB; leading spaces are no substitute 
as far as make is concerned. 

In the first line, the target name (or list) is followed by a colon, which is required. 
This, in turn, is followed by the dependency list if there is one. Several target 
names separated by white space can precede the colon; this indicates a list of 
independent targets that are built using the same dependency list and rule. 

Subsequent lines that start with a TAB are taken as the command lines that 
comprise the target’s rule. make is awfully fussy about those leading TABs. 
SPACE characters simply won’t do. 

Lines that start with a # are treated as comments up until the next (unescaped) 
NEWLINE, and do not terminate the target entry. The target entry is terminated 
by the next nonempty line that begins with a character other than TAB or #, or 
by the end of the file. 

A trivial makefile might consist of just one target: 

Figure 8-2 A Trivial Makefile 

test: 
	ls test 
	touch test 
# 'test' is now present 
	ls test 

The convention is to use the name Makefile, since filenames starting with a 
capital are listed first by ls; this highlights the fact that a makefile is 
present. 

When you run make with no arguments, it searches first for a file named 
makefile, or if there is no file by that name, Makefile. If either of these 
files is under SCCS control, make extracts the current version and uses it. 

If make finds a makefile, it begins the dependency check with the first target 
entry in that file. Otherwise you must list the targets to build as arguments on the 
command line. make displays each command it runs while building its targets. 

tutorial% make 
ls test 
test not found 
touch test 

Because the file test was not present (and therefore out of date), make 
performed the rule in its target entry. If you run make a second time, it issues a 
message indicating that the target is now up to date, and doesn’t perform the rule. 

make invokes a Bourne shell to process a command line if that line contains 
any shell metacharacters, such as a semicolon (;), redirection symbol 
(<, >, >>, ...) or pipe symbol (|), etc. If a shell isn’t required to parse the 
command line, make invokes the command directly for better performance. 

Line breaks within a rule are significant in that each command line is performed 
by a separate process or shell. 

This means that a rule such as: 

test: 
	cd /tmp 
	pwd 

behaves differently than you might expect, as shown below. 

tutorial% make test 
cd /tmp 
pwd 
/usr/tutorial/waite/arcana/minor/pentangles 

You can use semicolons to specify a sequence of commands to perform in a 
single shell invocation: 

test: 
	cd /tmp ; pwd 

Or, you can continue the input line onto the next line in the makefile by escaping 
the NEWLINE with a backslash (\): 

test: 
	cd /tmp ; \ 
	pwd 

The backslash must be the last character on the line. 
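
The effect of running each command line in its own shell can be reproduced outside of make. In this sketch, each sh -c invocation stands in for one command line of a rule. This is an analogy under that assumption, since make may also run simple commands directly without a shell; the function names are hypothetical.

```shell
# Two command lines, two shells: the cd is forgotten before pwd runs,
# so pwd reports the directory the function was called from.
separate_lines() {
    sh -c 'cd /tmp'
    sh -c 'pwd'
}

# One command line, one shell: pwd runs after the cd and reports /tmp.
single_line() {
    sh -c 'cd /tmp ; pwd'
}
```

Calling separate_lines from any directory other than /tmp prints that directory, while single_line prints the name of the temporary directory.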


Here is an example of a simple target entry to compile a C program from a single 
source file: 

Figure 8-3 Simple Target Entry for Compiling a C Program 

go: go.c 
	cc -sun4 -o go go.c 

This entry performs the same function with respect to go as in the second 
example of implicit rules shown above; it compiles an executable program from 
a C source file. 

Processing Dependencies 

Once make begins, it processes targets as it encounters them in its depth-first 
dependency scan. For example, with the following makefile: 

batch: a b 
	touch batch 
b: 
	touch b 
a: 
	touch a 
c: 
	echo "you won't see me" 

make starts with the target batch. Since batch has some dependencies that 
haven’t been checked yet, namely a and b, make defers checking batch until 
after it has checked each of them against any dependencies they might have. 

Since a has no dependencies, make processes it; if the file is not present make 
performs its rule. 

tutorial% make 
touch a 

Next, make works its way back up to the parent target batch. Since there is 
still an unchecked dependency b, make descends to b and checks it. 

b also has no dependencies, so make processes it: 

touch b 

Finally, now that all of the dependencies for batch have been checked and built 
if needed, make checks it against those dependency files. 

Since both a and b were built just now, and are therefore newer than batch, 
make builds it: 

touch batch 
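
The order of this walk can be mimicked with a small recursive shell function. This is a sketch of a depth-first scan over the same makefile, not make's implementation: deps_of and visit are hypothetical names, the dependency table is hard-coded, and every target is assumed to be missing so that each rule runs.

```shell
# Dependency table for the example makefile: batch depends on a and b.
deps_of() {
    case $1 in
        batch) echo "a b" ;;
        *)     echo "" ;;
    esac
}

# Visit every dependency first, then report the rule for the target itself.
visit() {
    for dep in $(deps_of "$1"); do
        visit "$dep"
    done
    echo "touch $1"
}

visit batch
```

The script prints touch a, touch b, and touch batch, in that order. The target c is never visited because nothing reachable from batch depends on it.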
Although there is a target entry for c in the makefile, make does not encounter it 
while performing its dependency scan. Target entries that aren’t encountered in 
the dependency scan are omitted from processing. You can select a starting 
target like c by entering it as an argument to the make command. 

For a makefile in which batch depends on a, b, and c, and a in turn depends on 
a1 and a2, the targets are checked and processed as follows: 

1. make checks batch for dependencies and notes that there are three, so it 
defers processing it. 

2. make checks a, the first dependency, and notes that it has two dependencies 
of its own. So, continuing in the same fashion, make: 

□ Checks a1, and if necessary, rebuilds it. 

□ Checks a2, and rebuilds it if necessary. 

□ Determines whether to build a. 

3. make checks b and rebuilds it if need be. 

4. make checks and rebuilds c if needed. 

5. After processing all of these nested dependencies, make checks and 
processes the topmost target, batch. 

Missing Targets and 
Dependencies 

If a target entry contains no rule, make attempts to select an implicit rule to build 
it. If make cannot find an appropriate rule to apply and there is no SCCS file to 
extract it from, make presumes that the target has an empty rule, and continues 
processing subsequent targets. With this makefile: 



make stops processing and issues an error message if the target was named either 
on the command line or in a dependency list but it: 

□ is missing, 

□ has no target entry, 

□ no implicit rule can be used to build it, and 

□ there is no SCCS file to extract it from. 

The following command produces: 

tutorial% make believe 
make: Fatal error: Don't know how to make target 'believe'. 

On the other hand, if the target entry has no rule, and make encounters the target 
in a dependency list, it does not produce an error, either when processing the 
dependency, or when processing the target for which it is a dependency. This 
holds true, even if the dependency file is absent. 

make finds a target entry for the dependency. It executes the (null) rule for that 
dependency without encountering errors. So, make concludes that the 
dependency has been updated successfully, at the time that the (null) rule is performed. 
The dependency is therefore considered newer than the target, even though no 
dependency file exists. In a case such as this, make simply goes on to rebuild the 
parent target (after processing any remaining dependencies). With this makefile: 


haste: FORCE 
	echo "haste makes waste" 
FORCE: 

You can use a dependency with a null rule to force the target’s rule to be 
executed. The conventional name for such a dependency is FORCE. 

make performs the rule for making haste, even if a file by that name is up to 
date: 

tutorial% touch haste 
tutorial% make haste 
echo "haste makes waste" 
haste makes waste 

Running Commands Silently 

You can inhibit the display of a given command line by inserting an @ as the first 
non-TAB character on that line. For example, the following target: 

quiet: 
	@echo you only see me once 

produces: 

tutorial% make quiet 
you only see me once 

If you want to inhibit the display during a particular make run, you can use the 
-s option. If you want to inhibit the display of all command lines in every run, 
add the special target .SILENT to your makefile: 

.SILENT: 
quiet: 
	echo you only see me once 

Special-function targets begin with a dot (.). Target names that begin with a 
dot are never used as the starting target, unless specifically requested as an 
argument on the command line. 

Ignoring a Command’s Exit 
Status 

make normally issues an error message and stops when a command returns a 
nonzero exit code. For example, if you have the target: 

rmxyz: 
	rm xyz 

and there is no file named xyz, make halts after rm returns its exit status. 

tutorial% ls xyz 
xyz not found 
tutorial% make rmxyz 
rm xyz 
rm: xyz: No such file or directory 
Error code 1 
make: Fatal error: Command failed for target 'rmxyz' 

To continue processing regardless of the command’s exit code, use a dash 
character (-) as the first non-TAB character: 

rmxyz: 
	-rm xyz 

If - and @ are the first two such characters, both take effect. 

In this case you get a warning message indicating the exit code make received: 

tutorial% make rmxyz 
rm xyz 
rm: xyz: No such file or directory 
Error code 1 (ignored) 

Although it is generally ill-advised to do so, you can have make ignore error 
codes entirely within a run with the -i option. You can also have make ignore 
exit codes when processing a given makefile, by adding the special target 
.IGNORE to your makefile, although this too should normally be avoided. 

.IGNORE: 
rmxyz: 
	rm xyz 

Unless you are testing a makefile, it is usually a bad idea to ignore non-zero 
error codes on a global basis. Specific commands that return non-zero status 
can be ignored in certain circumstances. But, in general, a non-zero exit code 
indicates trouble. It is best for make to stop so that you can diagnose the 
problem right away. 

If you are processing a list of targets, and you want make to continue with the 
next target on the list, rather than stopping entirely after encountering a 
non-zero return code, use the -k option. 
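
The difference between stopping on an error and ignoring it comes down to how the exit status of each command is handled. The sketch below imitates the two behaviors in the shell. The helper names run_strict and run_ignoring are hypothetical, and the messages only approximate those printed by make.

```shell
# Like an ordinary command line in a rule: report the failure and pass the
# nonzero status back so that the caller can stop.
run_strict() {
    "$@" || { status=$?; echo "Error code $status"; return "$status"; }
}

# Like a command line prefixed with '-': report the status, then carry on
# as though the command had succeeded.
run_ignoring() {
    "$@" || echo "Error code $? (ignored)"
}
```

For example, run_strict rm xyz && echo done prints nothing after the error when xyz is absent, whereas run_ignoring rm xyz && echo done still prints done.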

Automatic Extraction of SCCS 
Files 

When source files are named in the dependency list, make treats them just like 
any other target file. Because the source file is presumed to be present in the 
directory, there is no need to add an entry for it to the makefile. When a target 
has no dependencies, but is present in the directory, make assumes that that file 
is up to date. If, however, a source file is under SCCS control, make does some 
additional checking to assure that the source file is the version most recently 
checked in. If the file is missing, or if a new version has been checked in, 
make automatically issues an 

"sccs get -s -Gfilename filename" 

command to extract the most recent version.² If, however, the source file is 
writable by anyone, make does not extract it. 

² With other versions of make, automatic SCCS extraction was a feature only of certain implicit rules. Also, 
unlike earlier versions, make only looks for history (s.) files in the SCCS subdirectory; s. files in the current 
directory are ignored. 

tutorial% ls SCCS/* 
SCCS/s.go.c 
tutorial% rm -f go.c 
tutorial% make go 
sccs get -s go.c -Ggo.c 
cc -sun4 -o go go.c 

This makes it unnecessary to add SCCS commands for extracting current versions 
of source files; make handles this for you automatically. 
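
The extraction decision described in this section can be approximated in the shell. This is a rough sketch under stated assumptions, not make's actual logic: needs_sccs_get is a hypothetical name, the file is assumed to live in the current directory, and "a new version has been checked in" is approximated by the history file being newer than the working file.

```shell
# needs_sccs_get FILE
# Succeeds when FILE has a history file in SCCS/ and should be extracted.
needs_sccs_get() {
    history="SCCS/s.$1"
    [ -f "$history" ] || return 1        # not under SCCS control
    [ -e "$1" ] || return 0              # working file is missing
    # A file writable by owner, group, or others is left alone.
    writable=$(find "$1" -prune \( -perm -200 -o -perm -020 -o -perm -002 \) -print)
    [ -z "$writable" ] || return 1
    # Assumption: a history file newer than the working file means a new delta.
    [ -n "$(find "$history" -prune -newer "$1")" ]
}
```

A wrapper could then run needs_sccs_get go.c && sccs get -s go.c -Ggo.c before compiling.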


Suppressing SCCS Extraction 

The command for extracting SCCS files is specified in the rule for the 
.SCCS_GET special target in the default makefile. To suppress automatic 
extraction, simply add an entry for this target, without any rule, to your makefile: 

# Suppress sccs extraction. 
.SCCS_GET: 


Passing Parameters: Simple make Macros

make’s macro substitution comes in handy when you want to pass parameters to
command lines within a makefile. Suppose that you sometimes wish to compile
an optimized version of the program go using cc’s -O option. You can lend this
sort of flexibility to your makefile by adding a macro reference, such as the one
below, to the target for go:

    go: go.c
    	cc -sun4 $(CFLAGS) -o go go.c

The macro reference acts as a placeholder for a value that you define, either in
the makefile itself, or as an argument to the make command. If you then supply
make with a definition for the CFLAGS macro, make replaces the macro
reference with the value you have defined.

    tutorial% rm go
    tutorial% make go "CFLAGS= -O"
    cc -sun4 -O -o go go.c

There is a reference to the CFLAGS macro in both the .c and the .c.o
implicit rules.

If a macro is undefined, make replaces references to it with an empty string:

    tutorial% rm go
    tutorial% make go
    cc -sun4 -o go go.c

You can also include macro definitions in the makefile itself. A typical use is to
set CFLAGS to -O so that make produces optimized object code by default, as
shown below.

    CFLAGS= -O

    go: go.c
    	cc -sun4 $(CFLAGS) -o go go.c

With no arguments, the make command produces:

    tutorial% make
    cc -sun4 -O -o go go.c


A macro definition supplied as an argument to make overrides all other 
definitions for that macro found in that make run. For instance, to compile go 
for debugging with dbx or dbxtool, you can define the value of CFLAGS to be 
-g in the make command: 






    tutorial% rm go
    tutorial% make CFLAGS=-g
    cc -sun4 -g -o go go.c


To compile a profiling version for use with gprof, supply both -O and -pg in
the value for CFLAGS:

    tutorial% rm go
    tutorial% make "CFLAGS= -O -pg"
    cc -sun4 -O -pg -o go go.c


A macro reference must include parentheses when the name of the macro is 
longer than one character. If the macro name is only one character, the 
parentheses can be omitted. Also, you can use curly braces, { and }, instead of 
parentheses. For example: 


    S= echo now and forever
    .SILENT:

    when:
    	$S
    	$(S)
    	${S}

The three macro references are equivalent:

    tutorial% make when
    now and forever
    now and forever
    now and forever

Command Dependency Checking and .KEEP_STATE

In addition to the normal dependency checking, you can use the special target
.KEEP_STATE to activate command dependency checking. When activated,
make not only checks each target file against its dependency files, it also compares
each command line in the rule with the corresponding command line it ran the
last time it built the target. (This information is stored in a state file in the
current directory.) If the command line has changed, make rebuilds the target.
So, if .KEEP_STATE were in effect for the previous few examples, you
wouldn’t have had to type in all those rm go commands.

With the makefile:

    CFLAGS= -O
    .KEEP_STATE:
    go: go.c
    	cc -sun4 $(CFLAGS) -o go go.c

the following commands work as shown:

    tutorial% make
    cc -sun4 -O -o go go.c
    tutorial% make CFLAGS=-g
    cc -sun4 -g -o go go.c
    tutorial% make "CFLAGS= -O -pg"
    cc -sun4 -O -pg -o go go.c

This assures you that make compiles a program with the options you want, even
if a different variant of the file is present and is newer than its dependencies.

The first make run with .KEEP_STATE in effect recompiles all targets. This
ensures that they have, in fact, been built by the command line recorded in the
state file.

Suppressing or Forcing Command Dependency Checking for Selected Lines

To inhibit command dependency checking for a given command line, insert a
question mark as the first character after the TAB. For instance, without the
question mark, this makefile:

    ARG= redone or not

    .KEEP_STATE:

    x:
    	echo $(ARG) | tee x

reprocesses x when you define ARG on the command line, as shown below.

    tutorial% make x
    echo redone or not | tee x
    redone or not
    tutorial% make x "ARG= redone this time"
    echo redone this time | tee x
    redone this time

(Command dependency checking is not available in earlier versions of make.)

Adding a ? as the first character after the TAB suppresses command
dependency checking.

    ARG= is it redone

    .KEEP_STATE:

    x:
    	? echo $(ARG) | tee x

With it, x is not reprocessed as a result of changing ARG, as shown:

    tutorial% make x
    echo is it redone | tee x
    is it redone
    tutorial% make x "ARG= still not redone"
    'x' is up to date.


Command dependency checking is automatically suppressed for lines containing
the dynamic macro $?. This macro stands for the list of dependencies that are
newer than the current target, and can be expected to differ between any two
make runs. (See Implicit Rules and Dynamic Macros for more information.) To
force make to perform command dependency checking on a line containing this
macro, prefix the command line with a ! character (following the TAB).
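
A minimal sketch of forcing the check, assuming a hypothetical library target named libdemo.a that is built from two object files. The target and file names are illustrative and are not taken from this manual:

```make
# Hypothetical makefile; libdemo.a, main.o, and data.o are illustrative names.
.KEEP_STATE:

# This command line contains $?, so command dependency checking would
# normally be suppressed for it. The leading ! (immediately after the TAB)
# asks make to perform the check on this line anyway.
libdemo.a: main.o data.o
	!ar rv libdemo.a $?
```

Because the value of $? can differ from run to run, forcing the check on such a line may cause the target to be rebuilt more often than you expect.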

The State File

When the .KEEP_STATE special target is in effect, make writes out a state file
named .make.state, in the current directory. This file lists all targets that
have ever been processed while .KEEP_STATE has been in effect, in a format
similar to a makefile. In order to ensure that this state file is maintained
consistently, once you have added the .KEEP_STATE special target to a makefile,
we recommend that you leave it in effect.

Hidden Dependencies and .KEEP_STATE

When a source file contains #include directives for interpolating header files,
the target depends just as much on those header files as it does on the sources that
include them. Because such header files may not be listed explicitly as sources
in the compilation command line, they are called hidden dependencies. When
.KEEP_STATE is in effect, make receives a report from the various compilers
and compilation preprocessors indicating which hidden dependency files were
interpolated for each target. It adds this information to the dependency list in
the state file. In subsequent runs, these additional dependencies are processed
just like regular dependencies. This feature maintains the hidden dependency list
for each target automatically; this ensures that the dependency list for each target
is always accurate and up to date. It also eliminates the need for the complicated
schemes found in some earlier makefiles to generate complete dependency lists.

Since the .KEEP_STATE target is ignored in earlier versions of make, it does not
introduce any compatibility problems. Other versions simply treat it as a
superfluous target that no targets depend on, with an empty rule and no
dependencies of its own. Since it starts with a dot, it is not used as the starting target.

A slight inconvenience can arise the first time make processes a target with
hidden dependencies, because there is as yet no record of them in the state file. If a
header file is missing, and make has no record of it, make won’t know that it
needs to extract it from SCCS before compiling the target. So, even though there
is an SCCS history file, the current version won’t be extracted because it doesn’t
yet appear in a dependency list or the state file. So, when the C preprocessor
attempts to interpolate the header, it won’t find it; the compilation fails.

Suppose that an #include directive for interpolating the header file
hidden.h is added to go.c, and that the file hidden.h is somehow removed
before the subsequent make run. The results would be:

    tutorial% make go
    cc -sun4 -O -o go go.c
    go.c: 2: Can't find include file hidden.h
    make: Fatal error: Command failed for target 'go'


The workaround is simple. Just make sure that the new header file is present in 
the directory before you run make. Or, if the compilation should fail (and 
assuming the header file is under SCCS), extract it from SCCS manually: 

    tutorial% sccs get hidden.h
    10 lines
    tutorial% make go
    cc -sun4 -O -o go go.c


In future cases, should the header file turn up missing, make will know to build
or extract it for you, because it will be listed in the state file as a hidden
dependency:

    tutorial% rm go hidden.h
    tutorial% make go
    sccs get -s hidden.h -Ghidden.h
    cc -sun4 -O -o go go.c



Note that with hidden dependency checking, the $? macro includes the names
of hidden dependency files. This may cause unexpected behavior in existing
makefiles that rely on $?.


Hidden dependency checking is also unavailable with earlier versions of make.

Displaying Information About a make Run

Running make with the -n option displays the commands make is to perform,
without executing them. This comes in handy when verifying that the macros in
a makefile are expanded as expected. With the following makefile:

    CFLAGS= -O
    CPPFLAGS=
    LDFLAGS=

    .KEEP_STATE:

    program: main.o data.o
    	$(LINK.c) -o program main.o data.o

make -n displays:

    tutorial% make -n
    cc -O -sun4 -c main.c
    cc -O -sun4 -c data.c
    cc -O -sun4 -o program main.o data.o

There is an exception to this, however. make executes any command
line containing a reference to the MAKE macro (i.e., $(MAKE) or
${MAKE}), regardless of -n. So, it would be a very bad idea to include
a line like "$(MAKE) ; rm -f *" in your makefile.
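
One common use of a $(MAKE) reference is to start a nested make run in another directory. As a rough sketch, assuming a hypothetical subdirectory named lib that has its own makefile:

```make
# Hypothetical layout: a subdirectory named lib with its own makefile.
# Because this command line refers to $(MAKE), make executes it even
# when run with -n, so it should do nothing beyond starting the nested make.
all:
	cd lib; $(MAKE)
```

Keeping such lines free of other commands avoids the problem described above, where a command sharing a line with $(MAKE) runs even though -n was given.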


make has some other options that you can use to keep abreast of what it’s doing
and why:

-d      Displays the criteria by which make determines that a target is out of
        date. Unlike -n, it does process targets, as shown below. This
        option also displays the value imported from the environment (null by
        default) for the MAKEFLAGS macro, which is described in detail in a
        later section.

            tutorial% make -d
            MAKEFLAGS value:
            Building main.o using suffix rule for .c.o because it is out of date relative to main.c
            cc -O -sun4 -c main.c
            Building program because it is out of date relative to main.o
            Building data.o using suffix rule for .c.o because it is out of date relative to data.c
            cc -O -sun4 -c data.c
            Building program because it is out of date relative to data.o
            cc -O -sun4 -o program main.o data.o

        Setting an environment variable named MAKEFLAGS can lead to
        complications, since make adds its value to the list of options. To
        prevent puzzling surprises, avoid setting this variable.

-dd     Displays all dependencies make checks in vast detail.

-D      Displays the text of the makefile as it is read.

-DD     Displays the makefile and the default makefile, the state file, and hidden
        dependency reports for the current make run.

-f makefile
        make uses the named makefile (instead of makefile or Makefile).
        Several -f options indicate the concatenation of the named makefiles.

-p      Displays the complete set of macro definitions and target entries.

-P      Displays the complete dependency tree for each target encountered.


Due to its potentially troublesome side effects, we recommend against
using the -t (touch) option for make.

There is an option that can be used to shortcut make processing, the -t option.
When run with -t, make does not perform the rule for building a target. Instead
it uses touch to alter the modification time for each target that it encounters in
the dependency scan. It also updates the state file to reflect what it built.
This often creates more problems than it supposedly solves, and so we
recommend that you exercise extreme caution if you do use it. Note that if there is no
file corresponding to a target entry, touch creates it.


The following is one example of how not to use make -t. Suppose you have a
target named clean that performs housekeeping in the directory by removing
target files produced by make:

    clean:
    	rm program main.o data.o

If you give the erroneous command:

    tutorial% make -t clean
    touch clean
    tutorial% make clean
    'clean' is up to date.

you then have to remove the file clean before your housekeeping target can
work once again.


For a complete listing of all make options, refer to make(1) in the SunOS
Reference Manual.


8.2. Compiling Programs 
with make 


Compilation Strategies

In previous examples you have seen how to compile a simple C program from a
single source file, using both explicit target entries and implicit rules. Most C
programs, however, are compiled from several source files. Many include library
routines, either from one of the standard system libraries or from a local library.
Although it may be easier to recompile and link a single-source program using a
single cc command, it is usually more convenient to compile programs with
multiple sources in stages: first, by compiling each source file into a separate
object (.o) file, and then by linking the object files to form an executable
program (an a.out format file). This method requires more disk space, but
subsequent (repetitive) recompilations need be performed only on those object files for
which the sources have changed. The time saved is usually worth the extra space
required, since the remaining, up-to-date, object files are simply relinked as is
into a newly produced executable program.

A Simple Makefile

The makefile that follows compiles an executable program from two C source
files. In subsequent examples, this makefile will be refined and enhanced to take
advantage of make’s predefined macros and implicit rules. Subsequent sections
describe the mechanics of implicit rules, including how to add new ones of your
own.

Then, additional features are introduced that are useful in makefiles for
maintaining C object libraries. Later sections expand upon these examples to create
sophisticated templates that are easily modified to handle a variety of programs
or libraries.

Further examples illustrate template makefiles for more complex operations, such
as linking programs with user-supplied object libraries (from other
directories), linking C programs with assembly language routines, and compiling
programs from lex and yacc sources.

The makefile below is not very flexible or elegant, but it does the job. 

Figure 8-4  Simple Makefile for Compiling C Sources: Everything Explicit

    # Simple makefile for compiling a program from
    # two C source files.

    .KEEP_STATE:

    program: main.o data.o
    	cc -sun4 -o program main.o data.o
    main.o: main.c
    	cc -sun4 -O -c main.c
    data.o: data.c
    	cc -sun4 -O -c data.c
    clean:
    	rm program main.o data.o

In this example, the command:

    make

produces the object files main.o and data.o, and the executable file
program.


The last target, clean, removes these files. This is a common addition to
simplify housekeeping chores. The name clean is a convention for targets that
remove derived files.

Conventions have evolved for the use of certain target names, such
as all, clean and install, among others. There may be other
conventions in your organization. In general, it is a good idea to avoid
creating files by any such name in your source directories.

Makefiles for programs and libraries written in other compiled languages, such as FORTRAN 77, Pascal,
and Modula-2, are analogous.


Using make’s Predefined 
Macros 


The next example performs exactly the same function, but demonstrates the use
of make’s predefined macros for the indicated compilation commands. Using
predefined macros eliminates the need to edit makefiles when the underlying
compilation environment changes. Predefined macros also provide access to the CFLAGS
macro (and other FLAGS macros) for supplying compiler options from the
command line. Predefined macros are also used extensively within make’s implicit
rules. The predefined macros in the following makefile are listed below. They
are generally useful for compiling C programs.


Macro names that end in the string FLAGS are used to pass options to
a related compiler-command macro. It is good practice to use these
macros for consistency and portability. It is also good practice to
note the desired default values for them in the makefile.

The complete list of all predefined macros is shown in Table 1.2, below.


COMPILE.c       The complete cc command line; composed of the values of
                CC, CFLAGS, CPPFLAGS, and TARGET_ARCH, as follows,
                along with the -c option.

                    COMPILE.c=$(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c

                The root of the macro name, COMPILE, is a convention used
                to indicate that the macro stands for an entire compilation
                command line. The .c suffix is a mnemonic device to
                indicate that the command line applies to .c (C source) files.

LINK.c          The complete cc command line to link object files, like
                COMPILE.c, but without the -c option and with a reference
                to the LDFLAGS macro:

                    LINK.c=$(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH)

CC              The value cc. (You can redefine the value to be the pathname
                of an alternate C compiler.)

CFLAGS          Options for the cc command; none by default.

CPPFLAGS        Options for cpp; none by default.

LDFLAGS         Options for the link editor, ld; none by default.

AR              The ar command, which is used for maintaining library
                archives.

ARFLAGS         Flags for ar. The default value is rv.

TARGET_ARCH     The target-architecture argument to cc used for
                cross-compiling. The default is set by make to the value returned
                by the arch command.

TARGET_MACH     The target machine-type argument to cc that is used for
                cross-compiling. The default is set by make to the value
                returned by the mach command. Refer to Cross-Compilation
                on the Sun Workstation for details.

Predefined macros are used more extensively than in earlier versions of make. Not all of the predefined
macros shown here are available with earlier versions.
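
To see how these macros combine, here is a small sketch that redefines a few of them. The compiler pathname and the -DDEBUG option are hypothetical values chosen only for illustration:

```make
# Hypothetical settings; /usr/local/bin/altcc and -DDEBUG are illustrative.
CC= /usr/local/bin/altcc
CFLAGS= -g
CPPFLAGS= -DDEBUG

# Given the definition of COMPILE.c listed above, this command line
# should expand to:
#   /usr/local/bin/altcc -g -DDEBUG $(TARGET_ARCH) -c main.c
# with $(TARGET_ARCH) replaced by its default value.
main.o: main.c
	$(COMPILE.c) main.c
```

Because the options travel through CFLAGS and CPPFLAGS rather than being written into the command line, they can be changed in one place, or from the make command line, without editing the rule.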


Figure 8-5  Makefile for Compiling C Sources Using Predefined Macros

    # Makefile for compiling two C sources
    # using predefined macros.

    CFLAGS= -O
    CPPFLAGS=
    LDFLAGS=

    .KEEP_STATE:

    program: main.o data.o
    	$(LINK.c) -o program main.o data.o
    main.o: main.c
    	$(COMPILE.c) main.c
    data.o: data.c
    	$(COMPILE.c) data.c
    clean:
    	rm program main.o data.o



Using Implicit Rules to Simplify a Makefile: Suffix Rules

Since the command lines for compiling main.o and data.o from their
respective .c files are now functionally equivalent to the .c.o suffix rule, their target
entries are, in a sense, redundant; make performs the same compilation whether
they appear in the makefile or not. This next version of the makefile eliminates
them, relying on the .c.o rule to compile the individual object files.

Figure 8-6  Makefile for Compiling C Sources Using Suffix Rules

    # Makefile for a program from two C sources
    # using suffix rules.

    CFLAGS= -O
    CPPFLAGS=
    LDFLAGS=

    .KEEP_STATE:

    program: main.o data.o
    	$(LINK.c) -o program main.o data.o
    clean:
    	rm program main.o data.o


A complete list of suffix rules appears in Table 8-1.

As make processes the dependencies main.o and data.o, it finds no target
entries for them. So, it checks for an appropriate implicit rule to apply. In this
case, make selects the .c.o rule for building a .o file from a dependency file
that has the same basename and a .c suffix.


First, make scans its suffixes list to see if the suffix for the target file appears. In
the case of main.o, the string .o appears in the list. Next, make checks for a
suffix rule to build it with, and a dependency file to build it from. The
dependency file has the same basename as the target, but a different suffix. In this
case, while checking the .c.o rule, make finds a dependency file named
main.c, so it selects that rule. The target entry for the suffix rule is named for
the dependency suffix and the target suffix; the name is composed of the two
suffixes, so in this case the target name becomes .c.o. make applies the rule given
in the target entry by that name (in the default makefile).

make uses the order of appearance in the suffixes list to determine
which dependency file and suffix rule to use. For instance, if there
were both main.c and main.s files in the directory, make would
use the .c.o rule, since .c is ahead of .s in the list.

The suffixes list is a special-function target named .SUFFIXES. The various
suffixes are included in the definition for the SUFFIXES macro; the dependency
list for .SUFFIXES is given as a reference to this macro:

Figure 8-7  The Standard Suffixes List

    SUFFIXES= .o .c .c~ .s .s~ .S .S~ .ln .f .f~ .F .F~ .l .l~ \
    	.mod .mod~ .sym .def .def~ .p .p~ .r .r~ .y .y~ .h .h~ .sh .sh~

    .SUFFIXES: $(SUFFIXES)

The following example shows a makefile for compiling a whole set of executable
programs, each having just one source file. Each executable is to be built from a
source file that has the same basename, and the .c suffix appended. For instance
demo_1 is built from demo_1.c.

    # Makefile for a set of C programs, one source
    # per program. The source file names have ".c"
    # appended.

    CFLAGS= -O
    CPPFLAGS=
    LDFLAGS=

    .KEEP_STATE:

    all: demo_1 demo_2 demo_3 demo_4 demo_5

Like clean, all is a target name used by convention. It builds "all"
the targets in its dependency list. Normally, all is the first target;
make and make all are usually equivalent.


In this case, make does not find a suffix match for any of the targets (demo_1
through demo_5). So, it treats each as if it had a null suffix. It then searches for
a suffix rule and dependency file with a valid suffix. In the case of demo_2, it
would find a file named demo_2.c. Since there is a target entry for a .c(null)
rule, namely the .c rule, along with a corresponding .c file, make uses the rule
in the .c target entry to build demo_2 from demo_2.c.

There is no transitive closure for suffix rules. If you had a suffix rule for
building, say, a .y file from a .x file, and another for building a .z file from a .y
file, make would not combine their rules to build a .z file from a .x file. You
must specify the intermediate steps as targets, as in the next example.

    tutorial% ls mcp.[xyz]
    mcp.x
    tutorial% make mcp.z
    Don't know how to make mcp.z. Stop.
    tutorial% make mcp.y mcp.z
    cp mcp.x mcp.y
    cp mcp.y mcp.z
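
The transcript above does not show the makefile behind it. A makefile along these lines would behave that way, assuming hypothetical .x.y and .y.z suffix rules that simply copy their input:

```make
# Hypothetical suffix rules: each one just copies its dependency file.
.SUFFIXES: .x .y .z

.x.y:
	cp $< $@

.y.z:
	cp $< $@
```

Naming both mcp.y and mcp.z on the command line gives make the intermediate step explicitly, so each rule is applied in turn.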


When to Use Explicit Target Entries vs. Implicit Rules

Whenever you build a target from multiple dependency files, you must provide
make with an explicit target entry that contains a rule for doing so. When
building a target from a single dependency file, it is often convenient to use an
implicit rule.

As the previous examples show, make is happy to compile a single source file 
into a corresponding object file or executable. However, it has no built-in 
knowledge whatsoever about how to collate several files into one. For instance, 
it has no idea of the order in which to link a list of object files into an executable 
program. Also, make only compiles those object files that it encounters in its 
dependency scan. It needs a starting point—a target for which each object file in 
the list (and ultimately, each source file) is a dependency.

So, for a target built from multiple dependency files, make needs an explicit rule
that provides a collating order, and a dependency list that accounts for all of its
dependency files. On the other hand, if each of those dependency files is built
from just one source, you could use an implicit rule to build them.


Implicit Rules and Dynamic 
Macros 


make maintains a set of macros dynamically, on a target-by-target basis. These
macros are used quite extensively, especially in the definitions of implicit rules.
So, it is important to understand what they mean.

Because they aren’t explicitly defined in a makefile, the convention
is to document dynamic macros with the $-sign prefix attached (in
other words, by showing the macro reference).

They are:

$@      The name of the current target.

$?      The list of dependencies newer than the target.

$<      The name of the dependency file, as if selected by make for use with an
        implicit rule.

$*      The basename of the current target (the target name stripped of its suffix).

$%      For libraries, the name of the member being processed. See Building Object
        Libraries, below, for more information.


Implicit rules make use of these dynamic macros in order to supply the name of a
target or dependency file to a command line within the rule itself. For instance,
in the .c.o rule, shown in the next example,

    .c.o:
    	$(COMPILE.c) $< $(OUTPUT_OPTION)

$< is replaced by the name of the dependency file (in this case the .c file) for
the current target.

The macro OUTPUT_OPTION has an empty value by default. While
similar to CFLAGS in function, it is provided as a separate macro,
intended for passing in the -o filename compiler option, as needed,
to force compiler output to a given filename.

In the .c rule:

    .c:
    	$(LINK.c) $< -o $@

$@ is replaced with the name of the current target.

Because values for the $< and $* macros depend upon the order of suffixes
in the suffixes list, you may get surprising results when you use them in an
explicit target entry. See Suffix Replacement in Macro References for a strictly
deterministic method for deriving a filename from a related filename.

Dynamic Macro Modifiers

Dynamic macros can be modified by including F and D in the reference. If the
target being processed is in the form of a pathname, $(@F) indicates the
filename part, while $(@D) indicates the directory part. If there are no /
characters in the target name, then $(@D) is assigned the dot character (.) as its value.
For example, with the target named /tmp/test, $(@D) has the value /tmp;
$(@F) has the value test.

Dynamic Macros and the Dependency List: Delayed Macro References

Dynamic macros are assigned while processing any and all targets. They can be
used within the target’s rule as is, or in the dependency list by prepending an
additional $ character to the reference. A reference beginning with $$ is called a
delayed reference to a macro. For instance, the entry:

    x.o y.o z.o: $$@.BAK
    	cp $@.BAK $@

could be used to copy x.o from a backup copy named x.o.BAK, and so forth
for y.o and z.o.

How make Evaluates Dependencies

This technique works because make reads the dependency list twice, once as it
starts up, and again as it encounters each target while following the dependency
scan. Each time it does so, it resolves any macro references contained in the
dependency list. Before processing any dependencies, the dynamic macros aren’t
defined. Unless the references are delayed until the second pass, make would
resolve them to an empty value. The string $$ is a reference to the predefined
macro ‘$’. This macro, conveniently enough, has the value ‘$’; when make
resolves it in the first (parsing) pass, the string $$* is resolved to $*. Then, in
the second pass, the $* macro reference has a value dynamically assigned to it,
so make resolves the reference to that value.


Note that make only evaluates the target-name portion of a target entry in the first
pass. A delayed macro reference as a target name will produce incorrect results.
The makefile:

    NONE= none
    all: $(NONE)

    $$(NONE):
    	@: this target's name isn't 'none'

produces:

    tutorial% make
    make: Fatal error: Don't know how to make target 'none'


Also note that make evaluates the rule portion of a target entry only once, at the
time that the rule is executed. Here again, a delayed reference to a make macro
will produce incorrect results.

However, the $$ notation can be used, as described under Delayed
References to a Shell Variable below, to pass a shell variable reference to
the shell interpreting the command line.

Adding Suffix Rules

Although make supplies you with a number of useful suffix rules, you can also
add new ones of your own design. However, pattern matching rules, which are
described in the next section, are to be preferred when adding new implicit rules.
Unless you need to write implicit rules that are compatible with earlier versions
of make, you may safely skip the remainder of this section, which describes the
traditional method of adding implicit rules to makefiles.

Pattern matching rules, which are described in the next section,
are often easier to use than suffix rules. The procedure for adding
implicit rules is given here for compatibility with previous versions of make.


Adding a suffix rule is a two-step process. First, you must add the suffixes of
both target and dependency file to the suffixes list by providing them as
dependencies to the .SUFFIXES special target. Because dependency lists
accumulate, you can add suffixes to the list simply by adding another entry for this
target, for example:

    .SUFFIXES: .ms .tr

Second, you must add a target entry for the suffix rule:

    .ms.tr:
    	troff -t -ms $< > $@

A makefile with these entries can be used to format document source files
containing ms macros (.ms files) into troff output files (.tr files):

    tutorial% make doc.tr
    troff -t -ms doc.ms > doc.tr


Entries in the suffixes list are contained in the SUFFIXES macro. To insert
suffixes at the head of the list, first clear its value by supplying an entry for the
.SUFFIXES target that has no dependencies. This is an exception to the rule
that dependency lists accumulate. You can clear a previous definition for any
target with a name starting with the character ‘.’ by supplying a target entry for
that target with no dependencies and no rule, like this:


.SUFFIXES: 


When you do, both the previous rule and the previous dependency list are
erased. You can then add another entry containing the new suffixes, followed by
a reference to the SUFFIXES macro, as shown below.


    .SUFFIXES:
    .SUFFIXES: .ms .tr $(SUFFIXES)

Pattern matching rules are not available with earlier versions of make.

Note that the SUFFIXES macro name has no leading dot.

In previous versions of make, you can only clear the dependency list for the .SUFFIXES target.


Pattern Matching Rules: an Alternative to Suffix Rules

A pattern matching rule is similar to an implicit rule in function. Pattern
matching rules are easier to write, and more powerful, because you can specify a
relationship between a target and a dependency based on prefixes and suffixes both.
A pattern matching rule is a target entry of the form:

    tp%ts: dp%ds
    	rule

where tp and ts are the optional prefix and suffix in the target name, respectively,
dp and ds are the (optional) prefix and suffix in the dependency name, and % is a
wild card that stands for a basename common to both.


make checks for pattern matching rules ahead of suffix rules. While this
allows you to override the standard implicit rules, doing so is not
recommended.

If there is no rule for building a target, make searches for a pattern matching
rule before checking for a suffix rule. If make can use a pattern matching
rule, it does so.

If the target pattern matches the target name, there is a dependency file
matching the dependency pattern, and the target is out of date with respect to
that dependency file, make rebuilds the target. If the target is up to date
with respect to the dependency, make does not rebuild it, and continues
processing with the next target in the dependency hierarchy.

If the target entry for a pattern matching rule contains no rule, make
processes the target file as if it had an explicit target entry with no rule;
it therefore searches for a suffix rule, attempts to extract a version of the
target file from SCCS, and finally, treats the target as having a null rule
(flagging the target as updated, which forces any parent target to be rebuilt).

A pattern matching rule for formatting a troff source file into a troff output
file looks like:

    %.tr: %.ms
            troff -t -ms $< > $@

This is much easier to write, and much simpler to follow, than the equivalent
suffix rule would be.

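The matching step described above can be sketched in Python. This is a
simplified model under stated assumptions, not make's own algorithm: the
function names are hypothetical, modification times are plain numbers, and a
missing target is represented by None.

```python
# Rough model of matching a target against a pattern matching rule.

def match_pattern(pattern, name):
    """Return the part of `name` matched by % in `pattern`, or None."""
    prefix, _, suffix = pattern.partition("%")
    if (name.startswith(prefix) and name.endswith(suffix)
            and len(name) >= len(prefix) + len(suffix)):
        return name[len(prefix):len(name) - len(suffix)]
    return None

def dependency_for(target_pattern, dep_pattern, target):
    """Derive the dependency name for `target`, or None if no match."""
    stem = match_pattern(target_pattern, target)
    if stem is None:
        return None
    return dep_pattern.replace("%", stem, 1)

def needs_rebuild(target_mtime, dep_mtime):
    """A missing target (None) or one older than its dependency is out of date."""
    return target_mtime is None or target_mtime < dep_mtime

# With the rule '%.tr: %.ms', doc.tr is matched and depends on doc.ms;
# doc.o does not match the target pattern at all.
print(dependency_for("%.tr", "%.ms", "doc.tr"))
print(dependency_for("%.tr", "%.ms", "doc.o"))
```

The % stands for the same basename on both sides, which is why the dependency
name can be derived from the target name alone.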

make's Default Suffix Rules and Predefined Macros

The following tables list the standard set of suffix rules and predefined
macros supplied with make.

Table 8-1  make's Standard Suffix Rules

Use               Suffix Rule Name  Command Line(s)
----------------  ----------------  ----------------------------------------
Assembly Files    .s.o              $(COMPILE.s) -o $@ $<
                  .s.a              $(COMPILE.s) -o $% $<
                                    $(AR) $(ARFLAGS) $@ $%
                                    $(RM) $%
                  .S.o              $(COMPILE.S) -o $@ $<
                  .S.a              $(COMPILE.S) -o $% $<
                                    $(AR) $(ARFLAGS) $@ $%
                                    $(RM) $%
C Files           .c                $(LINK.c) -o $@ $< $(LDLIBS)
                  .c.ln             $(LINT.c) $(OUTPUT_OPTION) -i $<
                  .c.o              $(COMPILE.c) $(OUTPUT_OPTION) $<
                  .c.a              $(COMPILE.c) -o $% $<
                                    $(AR) $(ARFLAGS) $@ $%
                                    $(RM) $%
FORTRAN 77 Files  .f                $(LINK.f) -o $@ $< $(LDLIBS)
                  .f.o              $(COMPILE.f) $(OUTPUT_OPTION) $<
                  .f.a              $(COMPILE.f) -o $% $<
                                    $(AR) $(ARFLAGS) $@ $%
                                    $(RM) $%
                  .F                $(LINK.F) -o $@ $< $(LDLIBS)
                  .F.o              $(COMPILE.F) $(OUTPUT_OPTION) $<
                  .F.a              $(COMPILE.F) -o $% $<
                                    $(AR) $(ARFLAGS) $@ $%
                                    $(RM) $%
lex Files         .l                $(RM) $*.c
                                    $(LEX.l) $< > $*.c
                                    $(LINK.c) -o $@ $*.c $(LDLIBS)
                                    $(RM) $*.c
                  .l.c              $(RM) $@
                                    $(LEX.l) $< > $@
                  .l.ln             $(RM) $*.c
                                    $(LEX.l) $< > $*.c
                                    $(LINT.c) -o $@ -i $*.c
                                    $(RM) $*.c
                  .l.o              $(RM) $*.c
                                    $(LEX.l) $< > $*.c
                                    $(COMPILE.c) -o $@ $*.c
                                    $(RM) $*.c
Modula 2 Files    .mod              $(COMPILE.mod) -o $@ -e $@ $<
                  .mod.o            $(COMPILE.mod) -o $@ $<
                  .def.sym          $(COMPILE.def) -o $@ $<
NeWS              .cps.h            cps $*.cps
Pascal Files      .p                $(LINK.p) -o $@ $< $(LDLIBS)
                  .p.o              $(COMPILE.p) $(OUTPUT_OPTION) $<
Ratfor Files      .r                $(LINK.r) -o $@ $< $(LDLIBS)
                  .r.o              $(COMPILE.r) $(OUTPUT_OPTION) $<
                  .r.a              $(COMPILE.r) -o $% $<
                                    $(AR) $(ARFLAGS) $@ $%
                                    $(RM) $%
Shell Scripts     .sh               cat $< >$@
                                    chmod +x $@
yacc Files        .y                $(YACC.y) $<
                                    $(LINK.c) -o $@ y.tab.c $(LDLIBS)
                                    $(RM) y.tab.c
                  .y.c              $(YACC.y) $<
                                    mv y.tab.c $@
                  .y.ln             $(YACC.y) $<
                                    $(LINT.c) -o $@ -i y.tab.c
                                    $(RM) y.tab.c
                  .y.o              $(YACC.y) $<
                                    $(COMPILE.c) -o $@ y.tab.c
                                    $(RM) y.tab.c


Table 8-2  make's Predefined and Dynamic Macros

Use                  Macro        Default Value
-------------------  -----------  ---------------------------------------------
Library Archives     AR           ar
                     ARFLAGS      rv
Assembler Commands   AS           as
                     ASFLAGS
                     COMPILE.s    $(AS) $(ASFLAGS) $(TARGET_ARCH)
                     COMPILE.S    $(CC) $(ASFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c
C Compiler Commands  CC           cc
                     CFLAGS
                     CPPFLAGS
                     COMPILE.c    $(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c
                     LINK.c       $(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH)
FORTRAN 77 Compiler  FC           f77
Commands             FFLAGS
                     COMPILE.f    $(FC) $(FFLAGS) $(TARGET_ARCH) -c
                     LINK.f       $(FC) $(FFLAGS) $(TARGET_ARCH) $(LDFLAGS)
                     COMPILE.F    $(FC) $(FFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c
                     LINK.F       $(FC) $(FFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH)
Link Editor Command  LD           ld
                     LDFLAGS
lex Command          LEX          lex
                     LFLAGS
                     LEX.l        $(LEX) $(LFLAGS) -t
lint Command         LINT         lint
                     LINTFLAGS
                     LINT.c       $(LINT) $(LINTFLAGS) $(CPPFLAGS) $(TARGET_ARCH)
Modula 2 Commands    M2C          m2c
                     M2FLAGS
                     MODFLAGS
                     DEFFLAGS
                     COMPILE.def  $(M2C) $(M2FLAGS) $(DEFFLAGS) $(TARGET_ARCH)
                     COMPILE.mod  $(M2C) $(M2FLAGS) $(MODFLAGS) $(TARGET_ARCH)
Pascal Compiler      PC           pc
Commands             PFLAGS
                     COMPILE.p    $(PC) $(PFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c
                     LINK.p       $(PC) $(PFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH)
Ratfor Compilation   RFLAGS
Commands             COMPILE.r    $(FC) $(FFLAGS) $(RFLAGS) $(TARGET_ARCH) -c
                     LINK.r       $(FC) $(FFLAGS) $(RFLAGS) $(TARGET_ARCH) $(LDFLAGS)
rm Command           RM           rm -f
yacc Command         YACC         yacc
                     YFLAGS
                     YACC.y       $(YACC) $(YFLAGS)
Suffixes List        SUFFIXES     .o .c .c~ .s .s~ .S .S~ .ln .f .f~ .F .F~ .l
                                  .l~ .mod .mod~ .sym .def .def~ .p .p~ .r .r~
                                  .y .y~ .h~ .sh .sh~ .cps .cps~

Revision A of 9 May 1988



8.3. Building Object Libraries

Libraries, Members and Symbols

An object library is a set of object files contained in an ar library archive.
(See ar(1), ar(5), lorder(1), and ranlib(1) in the Commands Reference Manual
for details about library archive files.) Various languages make use of object
libraries to store compiled functions of general utility, such as those in the
C library.

ar reads in a set of one or more files to create a library. Each member
contains the text of one file, preceded by a header. This header contains
information taken from the file's directory entry when the text is read in,
including the modification time. make can treat the library member as a
separate entity for dependency checking using this header.

When you compile a program that uses functions from an object library
(specifying the proper library either by filename, or with the -l option to
cc), the link editor selects and links with the library member that contains a
needed function or symbol.

You can use ranlib to generate a symbol table for a library of object files. ld
uses this table for random access to symbols within the library, to locate and
link object files in which functions are defined. You can also use lorder and
tsort ahead of time to put members in calling order within the library. (See
lorder(1) for details.) For very large libraries, it is a good idea to do both.

Library Members and Dependency Checking

make recognizes a target or dependency of the form

    lib.a(member ...)

as a reference to a library member, or a space-separated list of members.
(Earlier versions of make recognize this notation. However, only the first item
in a parenthesized list of members was processed. In this version of make, all
members in a parenthesized list are processed.) For example, the following
target entry indicates that the library named librpn.a is built from members
named stacks.o and fifos.o. The pattern matching rule indicates that each
member depends on a corresponding object file, and that object file is built
from its corresponding source file using an implicit rule.

    librpn.a: librpn.a(stacks.o fifos.o)
            ar rv $@ $?
            ranlib $@
    librpn.a(%.o): %.o





When used with library-member notation, the dynamic macro $? contains the
list of files that are newer than their corresponding members:

    tutorial% make
    cc -sun4 -c stacks.c
    cc -sun4 -c fifos.c
    ar rv librpn.a stacks.o fifos.o
    a - stacks.o
    a - fifos.o
    ranlib librpn.a

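The selection of files for $? can be pictured with a small Python model. It is
only a sketch of the comparison described above: the function name and the
modification times (plain integers in arbitrary units) are hypothetical, and a
member that is not yet in the library is treated as older than any file.

```python
# Rough model of how $? is formed for 'lib.a(member ...)' dependencies.

def newer_than_members(file_mtimes, member_mtimes):
    """Return files that are newer than their library member, in order.

    A file whose member is not yet in the library counts as newer.
    """
    out_of_date = []
    for name, mtime in file_mtimes.items():
        member_time = member_mtimes.get(name)
        if member_time is None or mtime > member_time:
            out_of_date.append(name)
    return out_of_date

# Hypothetical modification times, in arbitrary units.
files = {"stacks.o": 200, "fifos.o": 150}

# Empty library: both files must be added, as in the run shown above.
print(newer_than_members(files, {}))

# Only stacks.o was rebuilt after it was last archived.
print(newer_than_members(files, {"stacks.o": 100, "fifos.o": 150}))
```

In the second case only stacks.o would be passed to ar, so the library is
updated without replacing members that are already current.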

Library Member Name-Length Limit

The name of an ar library member cannot exceed 15 characters. If a filename is
longer than that, ar truncates the name of its corresponding member to the
first 15 characters. If a library depends upon a member whose corresponding
filename is too long, make attempts to match the name of the member to the
first 15 characters of a file in the directory. make uses the first filename
that matches as the file from which to build the member.

.PRECIOUS: Preserving Libraries Against Removal Due to Interrupts

Normally, if you interrupt make in the middle of a target, the target file is
removed. For individual files this is a good thing; otherwise incomplete files
with brand new modification times might be left in the directory. For
libraries, which consist of several members, the story is different. It is
often better to leave the library intact, even if one of the members is still
out of date. This is especially true for large libraries, since a subsequent
make run will pick up where the previous one left off, by processing the object
file or member whose processing was interrupted.

.PRECIOUS is a special target that is used to indicate which files should be
preserved against removal on interrupts; make does not remove targets that are
listed as its dependencies. If you add the line:

    .PRECIOUS: librpn.a

to the makefile shown above, run make, and interrupt the processing of
librpn.a, the library is preserved.

The $% dynamic macro is provided specifically for use with libraries. When a
library member is the target, the member name is assigned to the $% macro. For
instance, this makefile below produces the results that follow.

8.4. Maintaining Programs and Libraries With make

In previous sections you have learned how make can help compile simple
programs and build simple libraries. The focus of this section is on developing
makefiles for more complex compilations. To eliminate possible sources of
confusion, it is often a good idea to put each module into a separate directory
of its own. This makes clear which source files pertain to which programs or
libraries, and allows you to create makefiles that operate consistently between
various parts of a software project. Subsequent sections describe how to
maintain, as a single entity, a project that spans several directories.

Using Macros for Added Flexibility

You have seen how to use predefined and dynamic macros within rules, and for
passing parameters from the command line. make also allows you to define your
own macros within a makefile. Macros allow you to simplify makefiles while
making them more flexible (for use with other modules, or other projects;
makefiles for this version of make are not necessarily portable to other
versions of make). With use of macros, you can develop template makefiles that
can be re-used, with only minor edits, for any number of similar compilation
procedures. The examples to follow illustrate how to use macros to develop
template makefiles for C programs and libraries.

Macro definitions can appear on any line in a makefile; macros can be used to
abbreviate long target lists or expressions, or as shorthand to replace long
strings that would otherwise have to be repeated. Macro names are allocated as
the makefile is read in; the value a particular macro reference takes depends
upon the most recent value assigned. For instance, in the following makefile,
the macro TEST evaluates to false.

    TEST= true
    TEST= false

    all:
            @echo $(TEST)


(Actually, macro evaluation is a bit more complicated than this. Refer to
Passing Parameters to Nested make Commands for more information.)

Embedded Macro References

Macro references can be embedded within other references, like this:

    $(OUTER$(INNER))

In which case they are expanded from innermost to outermost:

    OUTER= out
    INNER= in
    outin= something completely different

    all:
            @echo $(OUTER$(INNER))

produces:


A More Flexible Makefile

The makefile for compiling a C program that used implicit rules can be
generalized to accommodate other programs using macros. By replacing key words
with macros, and by editing the definitions of those macros, altering the
makefile for use with yet another program becomes a simple matter.

    # Flexible makefile for a C program.

    SOURCES= main.c data.c
    OBJECTS= main.o data.o
    PROGRAM= program

    CFLAGS= -O
    CPPFLAGS=
    LDFLAGS=

    .KEEP_STATE:

    $(PROGRAM): $(OBJECTS)
            $(LINK.c) -o $@ $(OBJECTS)

    clean:
            rm $(PROGRAM) $(OBJECTS)


In this case, you need only edit the SOURCES, OBJECTS and PROGRAM macros 
and you can compile a different program entirely, albeit in the same way. 

Although in a simple case like this the changes to the makefile might not seem 
worth the extra trouble, the added flexibility becomes increasingly important as 
you apply more powerful techniques. With judicious use of macros you can 
avoid having to puzzle over which specific changes you can, or should (or even 
dare), add to a makefile. 


Note: embedded macro references are not supported in previous versions of make.

Makefiles as Specifications

No one should have to scan an entire makefile just to puzzle out what it
builds.

A makefile performs an important function by documenting what files get built
from which sources, and what compilation options are used, by default, to build
them. Specifying this information with a set of macro definitions at the top of
the makefile is a great aid to the reader, especially when makefiles are
similar in format, or at all complicated.

Suffix Replacement in Macro References

In the flexible makefile shown above, the value of OBJECTS is a bit redundant.
It would be better to derive the names of the object files from the names of
the source files. In fact, there are any number of filenames that can be
derived from the names of source files, simply by altering their suffix. For
this reason, make provides a mechanism for temporarily replacing suffixes of
words in a macro's value, when the reference to that macro is of the form:

    $(macro:old-suffix=new-suffix)

This suffix replacement macro reference allows you to express the list of
object files in terms of the list of sources:

    OBJECTS= $(SOURCES:.c=.o)

It replaces all occurrences of the .c suffix in words within the value with the
.o suffix. The substitution is not applied to words that do not end in the
suffix given. The following makefile:

    OLD= main.c data.c moon
    NEW= $(OLD:.c=.o)

    all:
            @echo $(NEW)

illustrates this very simply:

    tutorial% make
    main.o data.o moon

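The word-by-word behavior of a suffix replacement reference can be modeled in
a few lines of Python. This is an illustrative sketch rather than make's
implementation; the function name replace_suffix is an assumption.

```python
# Rough model of a suffix replacement macro reference, $(macro:old=new).

def replace_suffix(value, old, new):
    """Replace a trailing `old` with `new` in each word of `value`.

    Words that do not end in `old` are passed through unchanged.
    """
    result = []
    for word in value.split():
        if word.endswith(old):
            word = word[:len(word) - len(old)] + new
        result.append(word)
    return " ".join(result)

OLD = "main.c data.c moon"

# Corresponds to NEW= $(OLD:.c=.o) in the makefile above.
print(replace_suffix(OLD, ".c", ".o"))

# The same value can yield other derived lists, such as lint output files.
print(replace_suffix(OLD, ".c", ".ln"))
```

The macro's own value is not altered by the reference; each call here returns
a new string and leaves OLD as it was.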

Although conventional suffixes start with dots, the suffix in a suffix
replacement macro reference may consist of any string of characters.

Using lint with make

We encourage you to lint your C programs for easier debugging and maintenance.
lint also checks for C constructs that are not considered portable across
machine architectures. It can be a real help in writing portable C programs.

lint, the C program verifier (see "Using lint" in Chapter 5), is an important
tool for forestalling the kinds of bugs that are most difficult and tedious to
track down. These include uninitialized pointers, parameter-count mismatches in
function calls, and nonportable uses of C constructs. As with the clean target,
lint is a target name used by convention; it is usually a good practice to
include it in makefiles that build C programs. lint produces output files that
have been preprocessed through cpp and its own first (parsing) pass. These
files characteristically end in the .ln suffix (this is true for the Sun
implementation; it may not be true for other versions of lint), and can also be
derived from the list of sources through suffix replacement:

    LINTFILES= $(SOURCES:.c=.ln)

The lint target entry appears as follows:

    lint: $(LINTFILES)
    $(LINTFILES):
            $(LINT.c) $(LINTFILES)

There is an implicit rule for building each .ln file from its corresponding .c
file, so there is no need for target entries for the .ln files. As sources
change, the .ln files are updated whenever you run

    make lint

Since the LINT.c predefined macro includes a reference to the LINTFLAGS
macro, it is a good idea to specify the lint options to use by default (none in
this case). Since lint entails the use of cpp, it is a good idea to use
CPPFLAGS, rather than CFLAGS, for compilation preprocessing options (such as
-I). The LINT.c macro does not include a reference to CFLAGS.

Also, when you run make clean you will want to get rid of any .ln files
produced by this target. It is a simple enough matter to add another such macro
reference to the clean target:

    clean:
            rm -f $(PROGRAM) $(OBJECTS) $(LINTFILES)

With these changes, the new version of the makefile appears as follows.

Figure 8-8  Makefile with "Suffix-Replacement" Macro References

    # Makefile for a C program with an entry for lint.

    SOURCES= main.c data.c
    PROGRAM= program

    CFLAGS= -O
    CPPFLAGS=
    LDFLAGS=
    LINTFLAGS=

    OBJECTS= $(SOURCES:.c=.o)
    LINTFILES= $(SOURCES:.c=.ln)

    .KEEP_STATE:

    $(PROGRAM): $(OBJECTS)
            $(LINK.c) -o $@ $(OBJECTS)

    lint: $(LINTFILES)
    $(LINTFILES):
            $(LINT.c) $(LINTFILES)

    clean:
            rm -f $(PROGRAM) $(OBJECTS) $(LINTFILES)

Linking With System-Supplied Libraries

This makefile is easily altered to compile a program that uses system-supplied
library packages. The next example shows a makefile that compiles a program
that uses the curses and termlib library packages for screen-oriented cursor
motion. A makefile for linking with user-supplied libraries appears later on.

You can also link with a library by specifying its pathname as an argument to
cc.

Figure 8-9  Makefile for a C Program With System-Supplied Libraries

    # @(#) sample.1.mk
    #
    # Makefile for a C program with curses and termlib.

    SOURCES= main.c data.c
    LIBS= -lcurses -ltermlib
    PROGRAM= program

    CFLAGS= -O
    CPPFLAGS=
    LDFLAGS=
    LINTFLAGS=

    OBJECTS= $(SOURCES:.c=.o)
    LINTFILES= $(SOURCES:.c=.ln)

    .KEEP_STATE:

    $(PROGRAM): $(OBJECTS)
            $(LINK.c) -o $@ $(OBJECTS) $(LIBS)

    lint: $(LINTFILES)
    $(LINTFILES):
            $(LINT.c) $(LINTFILES)

    clean:
            rm -f $(PROGRAM) $(OBJECTS) $(LINTFILES)

Since the link editor resolves undefined symbols as they are encountered, it is
normally a good idea to place library references at the end of the list of
files to link.

This makefile produces:

    tutorial% make
    cc -O -sun4 -c main.c
    cc -O -sun4 -c data.c
    cc -O -sun4 -o program main.o data.o -lcurses -ltermlib








Compiling Programs for Debugging and Profiling

Compiling programs for debugging or profiling introduces a new twist to the
procedure, and to the makefile. These variants are produced from the same
source code, but are built with different options to the C compiler. The cc
option to produce object code that is suitable for debugging is -g, and it is
important to omit the -O option in this case. The cc options that produce code
for profiling are -O and -pg.

Since the compilation procedure is the same otherwise, you could give make a
definition for CFLAGS on the command line. Since this definition overrides the
definition in the makefile, and .KEEP_STATE assures any command lines
affected by the change are performed, the command:

    make "CFLAGS= -O -pg"

produces the following results.

    tutorial% make "CFLAGS= -O -pg"
    cc -O -pg -sun4 -c main.c
    cc -O -pg -sun4 -c data.c
    cc -O -pg -sun4 -o program main.o data.o -lcurses -ltermlib

Of course, you may not want to memorize these options or type a complicated
command like this, especially when you can put this information in the
makefile. What is needed is a way to tell make how to produce a debugging or
profiling variant, and some instructions in the makefile that tell it how. One
way to do this might be to add two new target entries, one named debug, and the
other named profile, with the proper compiler options hard-coded into the
command line.

A better way would be to add these targets, but rather than hard-coding their
rules, include instructions to alter the definition of CFLAGS depending upon
which target it starts with. Then, by making each one depend on the existing
target for program, make could simply make use of its rule, along with the
specified options.

Instead of saying make "CFLAGS= -g", you could say make debug to compile a
variant for debugging. The question is, how do you tell make that you want a
macro defined one way for one target (and its dependencies), and another way
for a different target?


Conditional Macro Definitions

A conditional macro definition (not available with previous versions of make)
is a line of the form:

    target-list := macro = value

which assigns the given value to the indicated macro while make is processing
each target named in target-list and its dependencies. The following lines
give CFLAGS an appropriate value for processing each program variant.

    debug := CFLAGS= -g
    profile := CFLAGS= -pg -O

make must know which targets the definition applies to, so you can't use a
conditional macro definition to alter a target name.

Note that when you use a reference to a conditional macro in the dependency
list, that reference must be delayed (by prepending a second $). Otherwise,
make may expand the reference before the correct value has been assigned. When
it encounters a (possibly) incorrect reference of this sort, make issues a
warning.

Compiling Debugging and Profiling Variants

The following makefile produces optimized, debugging, or profiling variants of
a C program, depending on which target you specify (the default is the
optimized variant). Command dependency checking guarantees that the program and
its object files will be recompiled whenever you switch between variants.

all is a conventional target for building "all" final, or "finished" targets.
Debugging and profiling variants aren't normally considered part of a finished
program.

Figure 8-10  Makefile for a C Program with Alternate Debugging and Profiling
             Variants

    # @(#) sample.2.mk
    #
    # Makefile for a C program with alternate
    # debugging and profiling variants.

    SOURCES= main.c data.c
    LIBS= -lcurses -ltermlib
    PROGRAM= program

    CFLAGS= -O
    CPPFLAGS=
    LDFLAGS=
    LINTFLAGS=

    OBJECTS= $(SOURCES:.c=.o)
    LINTFILES= $(SOURCES:.c=.ln)

    .KEEP_STATE:

    all debug profile: $(PROGRAM)

    debug := CFLAGS= -g
    profile := CFLAGS= -pg -O

    $(PROGRAM): $(OBJECTS)
            $(LINK.c) -o $@ $(OBJECTS) $(LIBS)

    lint: $(LINTFILES)
    $(LINTFILES):
            $(LINT.c) $(LINTFILES)

    clean:
            rm -f $(PROGRAM) $(OBJECTS) $(LINTFILES)


Going through the makefile, all of the lines above .KEEP_STATE seem familiar.
The subsequent target entry specifies three targets, with all appearing first.

all traditionally appears as the first target in makefiles with alternate
starting targets (or those that process a list of targets). Its dependencies
are "all" targets that go into the final build, whatever that may be. In this
case, the final target is the optimized program variant. This entry also
indicates that debug and profile depend on program (the value of $(PROGRAM)).

The next two lines contain conditional macro definitions for CFLAGS, when it
appears in profile or debug, or their dependencies:

    debug := CFLAGS= -g
    profile := CFLAGS= -pg -O

Next comes the familiar target entry that starts with $(PROGRAM). Finally, the
remainder of the makefile looks familiar.

With this makefile,

    make

or

    make all

produces:

    tutorial% make
    cc -O -sun4 -c main.c
    cc -O -sun4 -c data.c
    cc -O -sun4 -o program main.o data.o -lcurses -ltermlib

    make debug

produces:

    tutorial% make debug
    cc -g -sun4 -c main.c
    cc -g -sun4 -c data.c
    cc -g -sun4 -o program main.o data.o -lcurses -ltermlib

and

    make profile

produces:

    tutorial% make profile
    cc -pg -O -sun4 -c main.c
    cc -pg -O -sun4 -c data.c
    cc -pg -O -sun4 -o program main.o data.o -lcurses -ltermlib

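The three runs above differ only in the value CFLAGS takes while the starting
target and its dependencies are processed. A rough Python model of that
behavior follows. It is a sketch under simplifying assumptions: the dictionary
names and the function compile_commands are hypothetical, only the compile
lines are generated, and the -sun4 target architecture option is left out.

```python
# Rough model of conditional macro definitions (target := macro = value).

GLOBAL = {"CFLAGS": "-O"}
CONDITIONAL = {
    "debug": {"CFLAGS": "-g"},
    "profile": {"CFLAGS": "-pg -O"},
}
DEPENDS = {
    "all": ["program"],
    "debug": ["program"],
    "profile": ["program"],
    "program": ["main.o", "data.o"],
}

def compile_commands(target, macros=None):
    """Walk `target` and its dependencies, collecting cc lines for .o files.

    Conditional values set for a target stay in effect for its dependencies.
    """
    macros = dict(GLOBAL if macros is None else macros)
    macros.update(CONDITIONAL.get(target, {}))
    commands = []
    for dep in DEPENDS.get(target, []):
        commands.extend(compile_commands(dep, macros))
    if target.endswith(".o"):
        commands.append("cc %s -c %s" % (macros["CFLAGS"], target[:-2] + ".c"))
    return commands

for start in ("all", "debug", "profile"):
    print(start, compile_commands(start))
```

Because the conditional value is attached to the starting target, the entry
for program itself never has to mention which variant is being built.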

The next example applies similar techniques to maintaining a C object library.

Figure 8-11  Makefile for a C Library with Alternate Variants

    # @(#) sample.3.mk
    #
    # Makefile for a C library with alternate
    # variants.

    SOURCES= calc.c map.c draw.c
    LIBRARY= libpkg.a

    CFLAGS= -O
    CPPFLAGS=
    LINTFLAGS=

    MEMBERS= $(SOURCES:.c=.o)
    LINTFILES= $(SOURCES:.c=.ln)

    all debug profile: $(LIBRARY)

    debug := CFLAGS= -g
    profile := CFLAGS= -pg -O

    .KEEP_STATE:

    .PRECIOUS: $(LIBRARY)

    $(LIBRARY): $(LIBRARY)($(MEMBERS))
            ar rv $@ $?
            ranlib $@

    $(LIBRARY)(%.o): %.o

    lint: $(LINTFILES)
    $(LINTFILES):
            $(LINT.c) $(LINTFILES)

    clean:
            rm -f $(LIBRARY) $(MEMBERS) $(LINTFILES)


Maintaining Separate Program and Library Variants

The previous two examples are adequate when development, debugging and
profiling are done in distinct phases. However, they suffer from the drawback
that all object files are recompiled whenever you switch between variants,
which can result in unnecessary delays. The next two examples illustrate how
all three variants can be maintained as separate entities.

To avoid the confusion that might result from having three variants of each
object file in the same directory as the program sources, it makes sense to
place the debugging and profiling objects and executables in their own
subdirectories. However, in order to do this we need a technique for adding the
name of the subdirectory as a prefix to each entry in the list of object files.

Pattern Replacement Macro References

A pattern replacement macro reference is similar in form and function to a
suffix replacement reference. You can use a pattern replacement reference to
add or alter a prefix, suffix, or both, to matching words in the value of a
macro. A pattern replacement reference takes the form:

    $(macro:p%s=np%ns)

where p is the existing prefix to replace (if any), s is the existing suffix to
replace (if any), np and ns are the new prefix and suffix, respectively, and %
is a wild card character that matches zero or more characters in each word. The
pattern replacement is applied to all words in the value that match. For
instance, this makefile:

    OLD= old.main.c old.data.c moon
    NEW= $(OLD:old.%.c=new.%.o)

    all:
            @echo $(NEW)

produces:

    tutorial% make
    new.main.o new.data.o moon

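The replacement shown above can be modeled in Python as follows. This is a
sketch of the described behavior, not make's implementation; the function name
replace_pattern is an assumption, and each pattern is assumed to contain
exactly one % character.

```python
# Rough model of a pattern replacement macro reference, $(macro:p%s=np%ns).

def replace_pattern(value, old, new):
    """Rewrite each word of `value` that matches `old` according to `new`."""
    old_prefix, _, old_suffix = old.partition("%")
    new_prefix, _, new_suffix = new.partition("%")
    result = []
    for word in value.split():
        fits = (word.startswith(old_prefix) and word.endswith(old_suffix)
                and len(word) >= len(old_prefix) + len(old_suffix))
        if fits:
            # The text matched by % is carried over unchanged.
            stem = word[len(old_prefix):len(word) - len(old_suffix)]
            word = new_prefix + stem + new_suffix
        result.append(word)
    return " ".join(result)

# Corresponds to NEW= $(OLD:old.%.c=new.%.o) in the makefile above.
print(replace_pattern("old.main.c old.data.c moon", "old.%.c", "new.%.o"))

# A bare % matches every word, which is how a directory prefix is added.
print(replace_pattern("main.o data.o", "%", "debug/%"))
```

The second call shows the use that matters for separate variants: every entry
in a list of object files gains the name of a subdirectory as a prefix.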
Please note, however, that pattern replacement macro references should not
appear on the dependency line of a pattern matching rule's target entry. This
produces unexpected results. With the makefile:

    OBJECT= .o

    x:
    x.Z:
            @echo correct
    %: %$(OBJECT:%o=%Z)

it looks as if make should attempt to build x from x.Z. However, the pattern
matching rule is not recognized; make cannot determine which of the %
characters in the dependency line to use in the pattern matching rule;
consequently, the target entry for x.Z is never reached.

    tutorial% make
    tutorial%

As with pattern matching rules, pattern replacement macro references aren't
available in earlier versions of make.

Makefile for a Program with Separate Variants

The following example shows a makefile for a C program with
separately-maintained variants. First, the .INIT special target creates the
debug and profile subdirectories (if they don't already exist), which will
contain the debugging and profiling object files and executables.

make performs the rule in the .INIT target just after the makefile is read.

Next, the macros DEBUG and PROFILE are assigned the program name, prefixed
with either debug/ or profile/, as appropriate. Pattern replacement macro
references to the PROGRAM macro are used to accomplish this. Next, the debug
and profile targets are set to depend on them so that when you type make
debug, instead of recompiling program with different compiler options, make
builds the file debug/program.

These variant executables are made to depend on the object files listed in the
VARIANTS.o macro. This macro is given the value of OBJECTS by default; later
on it may be reassigned using a conditional macro definition, at which time
either the debug/ or profile/ prefix is added, as appropriate, to each entry
in the list of object files; executables in the subdirectories depend on the
object files that are built in those same directories.

Next, pattern matching rules are added to indicate that the object files in
both subdirectories depend upon source (.c) files in the working directory.
This is the key step needed to allow all three variants to be built and
maintained from a single set of source files.

Here is an exception to the advice that a makefile should only maintain files
in the current working directory. Still, target files should only be built in a
subdirectory if they depend on source files in the working directory.

Finally, the clean target has been updated to recursively remove the debug
and profile subdirectories and their contents, which should be regarded as
temporary. This helps to impose the practice of keeping all files that are
critical to the program in the same directory as its source files, and not in
the subdirectories for the variants.

Figure 8-12  Makefile for Separate Debugging and Profiling Program Variants

    # @(#) sample.4.mk
    #
    # Makefile for maintaining separate debugging and
    # profiling program variants.

    SOURCES= main.c data.c
    LIBS= -lcurses -ltermlib
    PROGRAM= program

    CFLAGS= -O
    CPPFLAGS=
    LDFLAGS=
    LINTFLAGS=

    OBJECTS= $(SOURCES:.c=.o)
    LINTFILES= $(SOURCES:.c=.ln)
    DEBUG= $(PROGRAM:%=debug/%)
    PROFILE= $(PROGRAM:%=profile/%)
    VARIANTS.o= $(OBJECTS)

    .KEEP_STATE:

    .INIT:
            -mkdir profile debug

    all: $(PROGRAM)
    debug: $(DEBUG)
    profile: $(PROFILE)
    variants: debug profile

    $(DEBUG) := CFLAGS= -g
    $(PROFILE) := CFLAGS= -pg -O
    $(DEBUG) := VARIANTS.o= $(OBJECTS:%=debug/%)
    $(PROFILE) := VARIANTS.o= $(OBJECTS:%=profile/%)

    $(PROGRAM) $(DEBUG) $(PROFILE): $$(VARIANTS.o)
            $(LINK.c) -o $@ $(VARIANTS.o) $(LIBS)

    profile/%.o debug/%.o: %.c
            $(COMPILE.c) -o $@ $<

    lint: $(LINTFILES)
    $(LINTFILES):
            $(LINT.c) $(LINTFILES)

    clean:
            rm -rf $(PROGRAM) $(OBJECTS) $(LINTFILES) debug profile

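Which files each variant uses in the makefile above can be worked out by hand,
or with a small Python sketch like the following. It is a simplified model
only: the function name variant_plan is hypothetical, and it does no more than
derive file names from the SOURCES and PROGRAM values in the figure.

```python
# Rough model of which files each variant in the makefile above uses.

SOURCES = ["main.c", "data.c"]
PROGRAM = "program"

def variant_plan(subdir=""):
    """Return (executable, {object: source}) for one variant.

    `subdir` is "" for the optimized variant, else "debug/" or "profile/".
    """
    # Object names gain the subdirectory prefix, as with $(OBJECTS:%=debug/%).
    objects = [subdir + src[:-2] + ".o" for src in SOURCES]
    # Each object, whatever its directory, is built from the source file
    # of the same basename in the working directory.
    built_from = {obj: obj[len(subdir):-2] + ".c" for obj in objects}
    return subdir + PROGRAM, built_from

for sub in ("", "debug/", "profile/"):
    print(variant_plan(sub))
```

All three plans name the same two source files, which is the point of the
pattern matching rules: one set of sources serves every variant.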

Notice that the all target has not been made to depend on the debugging and 
profiling variants. This is because they are not normally part of final production 
build, so they aren’t included in the conventional meaning for all. However, if 
you want to build all three variants it is a simple matter to give the command: 


A sun 
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make all variants 

The modifications for separate library variants are quite similar. First, the
new macros DEBUG and PROFILE are assigned the library name with the proper
subdirectory prefix. VARIANTS.o is assigned the value of MEMBERS by default,
and conditionally defined for the debugging and profiling targets. Then, the
.INIT target is given so that the subdirectories are created (if not already
present). Then, the target entry for the library is altered to include all
three variants. Next, pattern matching rules are added to specify the
dependence of the variant libraries in the respective subdirectories on the
variant object files in those same directories. Other pattern matching rules
specify the dependence of those object files on source files in the current
working directory.

Finally, the clean target is modified to recursively remove the variant
subdirectories.
Makefile for a Library with Separate Variants

Figure 8-13 Makefile for Separate Debugging and Profiling Library Variants

# @(#) sample.5.mk
#
# Makefile for maintaining separate library
# variants.

SOURCES= calc.c map.c draw.c
LIBRARY= libpkg.a

CFLAGS= -O
CPPFLAGS=
LINTFLAGS=

MEMBERS= $(SOURCES:.c=.o)
LINTFILES= $(SOURCES:.c=.ln)
DEBUG= $(LIBRARY:%=debug/%)
PROFILE= $(LIBRARY:%=profile/%)
VARIANTS.o= $(MEMBERS)

.KEEP_STATE:

.PRECIOUS: $(LIBRARY)

.INIT:
	-mkdir profile debug

all: $(LIBRARY)
debug: $(DEBUG)
profile: $(PROFILE)
variants: debug profile

debug := CFLAGS= -g
profile := CFLAGS= -pg -O
$(DEBUG) := VARIANTS.o= $(MEMBERS:%=debug/%)
$(PROFILE) := VARIANTS.o= $(MEMBERS:%=profile/%)

$(LIBRARY) $(DEBUG) $(PROFILE): $$(VARIANTS.o)
	ar rv $@ $?
	ranlib $@
	rm -f $?

$(LIBRARY)(%.o): %.o
$(DEBUG)(debug/%.o): debug/%.o
$(PROFILE)(profile/%.o): profile/%.o
profile/%.o debug/%.o: %.c
	$(COMPILE.c) -o $@ $<

lint: $(LINTFILES)

$(LINTFILES):
	$(LINT.c) $(LINTFILES)

clean:
	rm -rf $(LIBRARY) $(MEMBERS) $(LINTFILES) debug profile

Here the command:

make all variants

produces:

tutorial% make all variants
cc -O -sun4 -c calc.c
cc -O -sun4 -c map.c
cc -O -sun4 -c draw.c
ar rv libpkg.a calc.o map.o draw.o
ar: creating libpkg.a
ranlib libpkg.a
rm -f calc.o map.o draw.o
cc -g -sun4 -c -o debug/calc.o calc.c
cc -g -sun4 -c -o debug/map.o map.c
cc -g -sun4 -c -o debug/draw.o draw.c
ar rv debug/libpkg.a debug/calc.o debug/map.o debug/draw.o
a - debug/calc.o
a - debug/map.o
a - debug/draw.o
ar: creating debug/libpkg.a
ranlib debug/libpkg.a
rm -f debug/calc.o debug/map.o debug/draw.o
cc -pg -O -sun4 -c -o profile/calc.o calc.c
cc -pg -O -sun4 -c -o profile/map.o map.c
cc -pg -O -sun4 -c -o profile/draw.o draw.c
ar rv profile/libpkg.a profile/calc.o profile/map.o profile/draw.o
a - profile/calc.o
a - profile/map.o
a - profile/draw.o
ar: creating profile/libpkg.a
ranlib profile/libpkg.a
rm -f profile/calc.o profile/map.o profile/draw.o

While an interesting and useful compilation technique, this method for
maintaining separate variants is a bit complicated. For clarity’s sake it is
omitted from subsequent examples.

Maintaining a Directory of Header Files

The makefile for maintaining an include directory of header files is really
quite simple. Since header files consist of plain text, all that is needed is
a target, all, that lists them all as dependencies. Automatic SCCS extraction
takes care of the rest. If you use a macro for the list of header files, this
same list can be used in other target entries, which may be added later for
project management purposes.

# Makefile for maintaining an include directory.
FILES.h= calc.h map.h draw.h

all: $(FILES.h)

clean:
	rm -f $(FILES.h)


This same technique can be applied to other files that do not require compilation 
or other such processing (such as man command document source files). 

Compiling and Linking With Your Own Libraries

When preparing your own library packages, it often makes sense to treat each
library as a separate entity from programs that use it, as well as the header
files used by both. Separating programs, libraries and header files into
distinct directories often makes it easier to prepare makefiles for each type
of module. It also clarifies the structure of a software project.

A courteous and necessary convention of makefiles is that they only build files
in the working directory, or in temporary subdirectories. Unless you are using
make specifically to install files into a specific directory on an agreed-upon
file system, it is regarded as very poor form for a makefile to produce output
in another directory.

It is not a good idea to have things pop up all over the file system as a
result of running make.

Building programs that rely on user-supplied libraries in other directories
adds several new wrinkles to the makefile. Up until now, everything needed has
been in the directory, or else in one of the standard directories that are
presumed to be stable. This is not true for user-supplied libraries that are
part of a project under development, especially when their contents are
subject to change.

More importantly, since these libraries aren’t built automatically (there is
no equivalent to automatic SCCS extraction for them), there must be an explicit
target entry to build them. So, a problem arises until such time as the library
has been completely tested and can be presumed to be stable.

On the one hand, you need to assure the libraries you link with are up to date.

On the other hand, you need to observe the convention that a makefile should
only maintain files in the local directory. In addition, the makefile should
not contain duplicate information that could get out of sync with a makefile in
another directory. The whole purpose of make, after all, is to provide
consistent, modular processing.

Nested make Commands

The solution is to use a nested make command, running in the directory the
library resides in, to rebuild it (according to the target entry in the
makefile there).

The MAKE macro, which is set to the value “make” in the default file, overrides
the -n option. Any command line in which it is referred to is executed, even
though -n may be in effect. Since this macro is used to invoke make, and since
the make it invokes inherits -n from the special MAKEFLAGS macro, make can
trace a hierarchy of nested make commands with the -n option.

# First cut entry for target in another
# directory.

../lib/libpkg.a:
	cd ../lib ; $(MAKE) libpkg.a



The library is specified with a pathname relative to the current directory. In
general, it is better to use relative pathnames. If the project is moved to a
new root directory or machine, so long as its structure remains the same
relative to that new root directory, all the target entries will still point to
the proper files.

Within the nested make command line, the dynamic macro modifiers F and D come
in handy, as does the MAKE predefined macro. If the target being processed is
in the form of a pathname, $(@F) indicates the filename part, while $(@D)
indicates the directory part. If there are no / characters in the target name,
then $(@D) is assigned the dot character (.) as its value.
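
The standard dirname and basename utilities split a pathname in the same way, so
they offer a quick way to check what $(@D) and $(@F) would hold for a given
target. The following is a minimal sketch in POSIX sh, not part of make itself;
the variable names target, dir_part and file_part are chosen for illustration
only.

```sh
#!/bin/sh
# Illustrative only: mimic how $(@D) and $(@F) split a target name.
# dirname and basename are standard utilities; the variable names
# are assumptions made for this sketch.
for target in ../lib/libpkg.a libpkg.a
do
    dir_part=`dirname "$target"`     # corresponds to $(@D)
    file_part=`basename "$target"`   # corresponds to $(@F)
    echo "$target: D=$dir_part F=$file_part"
done
```

For the second name, which contains no / character, the directory part comes out
as a single dot, matching the behavior described for $(@D).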


The target entry can be rewritten as:

# Second cut.

../lib/libpkg.a:
	cd $(@D); $(MAKE) $(@F)



Forcing A Nested make Command to Run

Because it has no dependencies, this target will only run when the file named
../lib/libpkg.a is missing. If the file is a library archive protected by
.PRECIOUS, this could be a rare occurrence. The current make invocation neither
knows nor cares about what that file depends on, nor should it. It is the
nested invocation that decides whether and how to rebuild that file. After all,
just because a file is present in the file system doesn’t mean that it is up to
date. This means that you have to force the nested make to run, regardless of
the file’s presence, by making it depend on a target with a null rule:

# Reliable target entry for a nested make
# command.

../lib/libpkg.a: FORCE
	cd $(@D); $(MAKE) $(@F)

FORCE:



In this way, make reliably cd’s to the directory ../lib and builds libpkg.a
if necessary, using instructions from the makefile found in that directory.

tutorial% make ../lib/libpkg.a
cd ../lib; make libpkg.a
make libpkg.a
'libpkg.a' is up to date.

The lines following the cd command are produced by the nested make run.


The following makefile uses a nested make command to process local libraries 
that a program depends on. 

Figure 8-14 Makefile for C Program With User-Supplied Libraries

# @(#) sample.6.mk
#
# Makefile for a C program with user-supplied
# libraries and nested make commands.

SOURCES= main.c data.c
ULIBS= ../lib/libpkg.a
SLIBS= -lcurses -ltermlib
PROGRAM= program

CFLAGS= -O
CPPFLAGS=
LDFLAGS=
LINTFLAGS=

OBJECTS= $(SOURCES:.c=.o)
LINTFILES= $(SOURCES:.c=.ln)

.KEEP_STATE:

all debug profile: $(PROGRAM)

debug := CFLAGS= -g
profile := CFLAGS= -pg -O

$(PROGRAM): $(OBJECTS) $(ULIBS)
	$(LINK.c) -o $@ $(OBJECTS) $(ULIBS) $(SLIBS)

$(ULIBS): FORCE
	cd $(@D); $(MAKE) $(@F)

FORCE:

lint: $(LINTFILES)

$(LINTFILES):
	$(LINT.c) $(LINTFILES)

clean:
	rm -f $(PROGRAM) $(OBJECTS) $(LINTFILES)

When ../lib/libpkg.a is up to date, this makefile produces:

tutorial% make
cc -O -sun4 -c main.c
cc -O -sun4 -c data.c
cd ../lib; make libpkg.a
'libpkg.a' is up to date.
cc -O -sun4 -o program main.o data.o ../lib/libpkg.a -lcurses -ltermlib



The MAKEFLAGS Macro

Like the MAKE macro, MAKEFLAGS is also a special case. As its name suggests,
it contains flags (that is, single-character options) for the make command.
Unlike other FLAGS macros, the MAKEFLAGS value is a concatenation of flags,
without a leading -. For instance, the string eiknp would be a recognized
value for MAKEFLAGS, while ‘-f x.mk’ or ‘macro=value’ would not.

Do not define MAKEFLAGS in your makefiles.

If the MAKEFLAGS environment variable is set, make runs with the combination
of flags given on the command line and contained in that variable.

The value of MAKEFLAGS is always exported, whether set in the environment or
not, and the options it contains are passed to any nested make commands
(whether invoked by $(MAKE), make or /usr/bin/make). This ensures that nested
make commands are always passed the options that the parent make was invoked
with. Because MAKEFLAGS is maintained automatically, defining it in the
makefile would only be misleading.

Macro Definitions and Environment Variables: Passing Parameters to Nested
make Commands

With the exception of MAKEFLAGS,^ make imports variables from the environment
and treats them as if they were defined macros. In turn, make propagates those
environment variables and their values to commands it invokes, including nested
make commands. Macros can also be defined as command-line arguments, as well
as in the makefile. This can lead to name-value conflicts when a macro is
defined in more than one place, and so make has a fairly complicated
precedence rule for resolving them.

First of all, conditional macro definitions always take effect within the targets 
(and their dependencies) for which they are defined. 

If make is invoked with a macro-definition argument, that definition takes
precedence over definitions given either within the makefile, or imported from
the environment. (This does not necessarily hold true for nested make commands,
however.) Otherwise, if you define (or redefine) a macro within the makefile,
the most recent definition applies. The latest definition normally overrides
the environment. Lastly, if the macro is defined in the default file and
nowhere else, that value is used.

The -e option alters this scheme. With -e, macros defined in the environment 
override any and all makefile definitions (but not the command line). 


^ and SHELL. The SHELL environment variable is neither imported nor exported in this version of make.
See make(1) in the SunOS Reference Manual for more information about the SHELL macro.



Revision A of 9 May 1988 



Chapter 8 — make User’s Guide 169 


With nested make commands, definitions made in the makefile normally override
the environment, but only for the makefile in which each definition occurs;
the value of the corresponding environment variable is propagated regardless.
Command-line definitions override both environment and makefile definitions,
but only for the topmost make run. Although values from the command line are
propagated to nested make commands, they are overridden both by definitions in
the nested makefiles, and by environment variables imported by the nested make
commands.

The -e option behaves more consistently. The environment overrides macro 
definitions made in any makefile, and command-line definitions are always used 
ahead of definitions in the makefile and the environment. One drawback to -e is 
that it introduces a situation in which information that is not contained in the 
makefile can be critical to the success or failure of a build. 

This is an awful lot to remember, so a good rule of thumb when passing
parameters to nested make commands is: supply them as command-line
definitions, and use -e. However, before you run make with the -e option, it is
important to eliminate all extraneous or improperly defined environment
variables, since make -e will propagate whatever is in the environment to the
entire hierarchy of nested make commands:

make -e CFLAGS=-E

Environment variables don’t go away when you’re done with them (that is, they
stay around to haunt you, especially when you attempt to build something else
with make later on). One way to avoid lingering environment variables is to
invoke make within a subshell. When you set environment variables and run make
in the subshell, their values are isolated within that subshell and any
processes it spawns (like the one for make):

( setenv CFLAGS -E ; make -e ) 
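
The command above uses C shell syntax. The same isolation can be observed in a
Bourne-style shell; the sketch below is only an illustration under that
assumption, with a child sh process standing in for make so that nothing needs
to be built. The variable names inside and outside are made up for the example.

```sh
#!/bin/sh
# Sketch: a variable exported inside ( ... ) is visible to processes
# started there, but is gone once the subshell exits.
unset CFLAGS

# The child "sh -c" plays the role of make: it inherits the exported
# variable from the subshell. printf is used because some echo
# implementations would treat the value -E as an option.
inside=`( CFLAGS=-E; export CFLAGS; sh -c 'printf "%s\n" "$CFLAGS"' )`

# Back in the parent shell, CFLAGS was never set.
outside=${CFLAGS-unset}

echo "inside the subshell: CFLAGS=$inside"
echo "after the subshell:  CFLAGS=$outside"
```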

This next example illustrates the difference in parameters between the top make
run and the nested make runs, using the two makefiles shown below.

# top.mk

MACRO= "Correct if unexpected."

top:
	@echo "-------------------- top"
	echo $(MACRO)
	@echo "--------------------"
	$(MAKE) -f nested.mk
	@echo "-------------------- clean"

clean:
	rm nested

and:

# nested.mk

MACRO=nested

nested:
	@echo "-------------------- nested"
	touch nested
	echo $(MACRO)
	$(MAKE) -f top.mk
	$(MAKE) -f top.mk clean

With these makefiles, the command:

make -f top.mk MACRO=top

produces the results that follow.

tutorial% make -f top.mk MACRO=top
-------------------- top
echo top
top
--------------------
make -f nested.mk
-------------------- nested
touch nested
echo nested
nested
make -f top.mk
-------------------- top
echo "Correct if unexpected."
Correct if unexpected.
--------------------
make -f nested.mk
'nested' is up to date.
make -f top.mk clean
rm nested
-------------------- clean

This pair of makefiles can be helpful if you decide to review the various cases
yourself.
Table 8-3 Summary of Macro Assignment Order

                          Without -e                  With -e in effect

top-level make command:   conditional definitions     conditional definitions
                          make command line           make command line
                          latest makefile definition  environment value
                          environment value           latest makefile definition
                          predefined value, if any    predefined value, if any

nested make commands:     conditional definitions     conditional definitions
                          make command line           make command line
                          latest makefile definition  parent make cmd. line
                          environment variable        environment value
                          predefined value, if any    latest makefile definition
                          parent make cmd. line       predefined value, if any
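
The top-level rows of Table 8-3 can be modeled with a few lines of shell. This
is only a toy model for reasoning about the order, not a description of how make
is implemented: the function name resolve and its argument layout are
hypothetical, an empty argument stands for "not defined in that place", and
conditional definitions are left out.

```sh
#!/bin/sh
# Toy model of the top-level make rows of Table 8-3. It does not run make.
# usage: resolve MODE CMDLINE MAKEFILE ENVIRON PREDEFINED
#   MODE is "e" when -e is in effect, anything else otherwise.
#   An empty argument means the macro is not defined in that place.
resolve() {
    mode=$1
    cmdline=$2
    makefile=$3
    environ=$4
    predefined=$5
    if [ "$mode" = e ]; then
        order="cmdline environ makefile predefined"
    else
        order="cmdline makefile environ predefined"
    fi
    # Take the first place, in precedence order, that supplies a value.
    for place in $order
    do
        eval "value=\$$place"
        if [ -n "$value" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done
    return 1
}

resolve "" ""   "-O" "-g" ""   # makefile definition wins without -e
resolve e  ""   "-O" "-g" ""   # environment value wins with -e
resolve e  "-E" "-O" "-g" ""   # command line wins in both cases
```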


Compiling Other Source Files

The following examples illustrate the use of make to maintain C programs that
contain assembly routines, and programs produced with lex and yacc.

Compiling and Linking a C Program with Assembly Language Routines

The makefile in the next example maintains a program with C source files linked
with assembly language routines.^ There are two varieties of assembly source
files, those that contain cpp preprocessor directives, and those that don’t. By
convention, assembly source files without preprocessor directives have the .s
suffix. Assembly sources that require preprocessing have the .S suffix.

ASFLAGS passes options for as to the .s.o and .S.o implicit rules.

Assembly sources are assembled to form object files in a fashion similar to that
used to compile C sources. The object files can then be linked into a C program.
make has implicit rules for transforming .s and .S files into object files, so
at a minimum, a target entry for a C program with assembly routines need only
specify how to link the object files. You can use the familiar cc command to
link object files produced by the assembler:

CFLAGS= -O
ASFLAGS= -O

.KEEP_STATE:

driver: c_driver.o s_routines.o S_routines.o
	cc -o driver c_driver.o s_routines.o S_routines.o



The next example shows a more flexible makefile for this sort of compilation. 


^ Refer to the Assembly Reference Manual for more information about assembly language source files. 
Figure 8-15 Makefile for a C Program with Assembly Routines

# @(#) sample.7.mk
#
# Makefile for a C program linked with assembly routines.

SOURCES.c= c_driver.c
SOURCES.s= s_routines.s
SOURCES.S= S_routines.S
ULIBS=
SLIBS=
PROGRAM= driver

ASFLAGS=
CFLAGS= -O
CPPFLAGS=
LDFLAGS=
LINTFLAGS=

OBJECTS= $(SOURCES.c:.c=.o) $(SOURCES.s:.s=.o) $(SOURCES.S:.S=.o)
LINTFILES= $(SOURCES.c:.c=.ln) # not for assembly sources

.KEEP_STATE:

all debug profile: $(PROGRAM)

debug := CFLAGS= -g
profile := CFLAGS= -pg -O

$(PROGRAM): $(OBJECTS) $(ULIBS)
	$(LINK.c) -o $@ $(OBJECTS) $(ULIBS) $(SLIBS)

$(ULIBS): FORCE
	cd $(@D); $(MAKE) $(@F)

FORCE:

lint: $(LINTFILES)

$(LINTFILES):
	$(LINT.c) $(LINTFILES)

clean:
	rm -f $(PROGRAM) $(OBJECTS) $(LINTFILES)



This makefile compiles the executable program driver as shown:

tutorial% make
cc -O -sun4 -c c_driver.c
as -sun4 -o s_routines.o s_routines.s
cc -sun4 -c -o S_routines.o S_routines.S
cc -O -sun4 -o driver c_driver.o s_routines.o S_routines.o

Note that the .S files are processed using the cc command, which invokes the C
preprocessor cpp, and invokes the assembler implicitly.
lex and yacc produce C source files as output. Source files for lex end in the
suffix .l, while those for yacc end in .y. When used separately, the
compilation process for each is similar to that used to produce programs from C
sources alone. There are implicit rules for compiling the lex or yacc sources
into .c files; from there the files are further processed with the implicit
rules for compiling object files from C sources. When these source files
contain no #include statements, there is no need to keep the .c file, which in
this simple case serves as an intermediate file. In this case one could use the
.l.o rule, or the .y.o rule, respectively, to produce the object files, and
remove the (derived) .c files. For example, the makefile:



Things get to be a bit more complicated when you use lex and yacc in
combination. In order for the object files to work together properly, the C
code from lex must include a header file produced by yacc. So, it may be
necessary to recompile the C source file produced by lex when the yacc source
file changes. In this case, it is better to retain the .c (intermediate) files
produced by lex, as well as the additional .h file that yacc provides, so as to
avoid running lex whenever the yacc source changes.
The following makefile maintains a program built from a lex source, a yacc
source, and a C source file.

yacc produces output files named y.tab.c and y.tab.h. If you want the output
files to have the same basename as the source file, you must rename them.

CFLAGS= -O

.KEEP_STATE:

a2z: c_functions.o scanner.o parser.o
	cc -o $@ c_functions.o scanner.o parser.o

scanner.c:

parser.c + parser.h: parser.y
	yacc -d parser.y
	mv y.tab.c parser.c
	mv y.tab.h parser.h



Since there is no transitive closure for implicit rules, you must supply a
target entry for scanner.c. This entry bridges the gap between the .l.c
implicit rule and the .c.o implicit rule, so that the dependency list for
scanner.o extends to scanner.l. Since there is no rule in the target entry,
scanner.c is built using the .l.c implicit rule.

The next target entry describes how to produce the yacc intermediate files.
Because there is no implicit rule for producing both the header file and the C
source file using yacc -d, a target entry must be supplied that includes a rule
for doing so.


Specifying Target Groups With the + Sign

In the target entry for parser.c and parser.h, the + sign separating the target
names indicates that the entry is for a target group.^ A target group is a set
of files, all of which are produced when the rule is performed. Taken as a
group, the set of files is what comprises the target. Without the + sign, each
item listed would comprise a separate target. With a target group, make checks
the modification dates separately against each target file, but performs the
target’s rule only once, if necessary, per make run.

The next example shows a makefile for the more general case of a lex source, a 
yacc source, and any number of C source files. 


^ Not available with earlier versions of make. 
Figure 8-16 Makefile for Compiling C Programs With lex and yacc Sources

# @(#) sample.8.mk
#
# Makefile to compile a C program with lex and yacc sources.

SOURCES.c= c_functions.c
LEXFILE.l= scanner.l
YACCFILE.y= parser.y
ULIBS=
SLIBS=
PROGRAM= a2z

LFLAGS=
YFLAGS=
CFLAGS= -O
CPPFLAGS=
LDFLAGS=
LINTFLAGS=

LEXFILE.c= $(LEXFILE.l:.l=.c)
YACCFILE.c= $(YACCFILE.y:.y=.c)
YACCFILE.h= $(YACCFILE.y:.y=.h)
SOURCES= $(SOURCES.c) $(LEXFILE.c) $(YACCFILE.c)
OBJECTS= $(SOURCES:.c=.o)
LINTFILES= $(SOURCES:.c=.ln)

.KEEP_STATE:

all debug profile: $(PROGRAM)

debug := CFLAGS= -g
profile := CFLAGS= -pg -O

$(PROGRAM): $(OBJECTS) $(ULIBS)
	$(LINK.c) -o $@ $(OBJECTS) $(ULIBS) $(SLIBS)

$(LEXFILE.c): $(YACCFILE.h)

$(YACCFILE.c) + $(YACCFILE.h): $(YACCFILE.y)
	$(YACC.y) -d $(YACCFILE.y)
	mv y.tab.c $(YACCFILE.c)
	mv y.tab.h $(YACCFILE.h)

$(ULIBS): FORCE
	cd $(@D); $(MAKE) $(@F)

FORCE:

lint: $(LINTFILES)

$(LINTFILES):
	$(LINT.c) $(LINTFILES)

clean:
	rm -f $(PROGRAM) $(OBJECTS) $(LINTFILES)

tutorial% make all
cc -O -sun4 -c c_functions.c
yacc -d parser.y
mv y.tab.c parser.c
mv y.tab.h parser.h
rm -f scanner.c
lex -t scanner.l > scanner.c
cc -O -sun4 -c scanner.c
cc -O -sun4 -c parser.c
cc -O -sun4 -o a2z c_functions.o scanner.o parser.o



Maintaining Shell Scripts with make and SCCS

Although a shell script is a plain text file, it must be executable in order to
run. Since SCCS removes execute permission for files under its control and a
shell script must have execute permission in order to run, a distinction must
be drawn between a shell script and its “source” file under SCCS control. make
has an implicit rule for deriving a script from its “source” file under SCCS.
The suffix for a shell script source file is .sh. Even though the contents of
the script and the .sh file are the same, the script has execute permissions,
while the .sh file does not. make’s implicit rule for scripts “derives” the
script from its source file, making a copy of the .sh file (extracting it
first, if necessary) and changing the mode of the resulting script file to
allow execution. For example:


tutorial% file script.sh
script.sh: ascii text
tutorial% make script
cat script.sh > script
chmod +x script
tutorial% file script
script: commands text
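
The two commands that make runs in the example above can also be tried by hand.
The following sketch does so in a scratch directory and records whether each
file is executable; the directory name, the script contents and the variable
names are all invented for the illustration, and chmod a-x merely imitates the
state of a file freshly retrieved from SCCS.

```sh
#!/bin/sh
# Sketch of the steps the .sh implicit rule performs, done by hand.
workdir=${TMPDIR:-/tmp}/shrule.$$
mkdir "$workdir" || exit 1
cd "$workdir" || exit 1

# A stand-in "source" file without execute permission.
printf '%s\n' '#!/bin/sh' 'echo hello from script' > script.sh
chmod a-x script.sh

cat script.sh > script       # copy the source to the target name
chmod +x script              # allow the copy to be executed

# Record the permissions of the source and of the derived script.
if [ -x script.sh ]; then src_exec=yes; else src_exec=no; fi
if [ -x script ]; then tgt_exec=yes; else tgt_exec=no; fi
echo "script.sh executable: $src_exec"
echo "script executable:    $tgt_exec"

cd / && rm -rf "$workdir"
```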





Running Tests with make

Shell scripts often come in handy for running tests, and performing other
routine tasks that are either interactive, or don’t require make’s dependency
checking. Test suites, in particular, often entail providing a program with
specific, repeatable input that a program might expect to receive from a
terminal.

In the case of a library, a set of programs that exercise its various functions
may be written in C, and then executed in a specific order, with specific
inputs from a script. In the case of a utility program, there may be a set of
benchmark programs that exercise and time its functions. In each of these
cases, the commands to run each test can be incorporated into a shell script
for repeatability and easy maintenance.

Once you have developed a test script that suits your needs, including a target
to run it is easy. Although make’s dependency checking may not be needed within
the script itself, you can use it to make sure that the program or library is
updated before running those tests.
In the following target entry for running tests, test depends on the library
named as a dependency to all. If the library is out of date, make rebuilds it
and proceeds with the test. This ensures that you always test with an
up-to-date version:

test: all testscript
	set -x ; testscript > /tmp/test.$$$$

testscript: testscript.sh test_1 test_2 test_3

test_1 test_2 test_3: $$@.c $(LIBRARY)
	$(LINK.c) -o $@ $< $(LIBRARY) $(SLIBS)


test also depends on testscript, which in turn depends on the three test
programs. This assures that they too are up to date before make initiates the
test procedure. all is built according to its target entry in the makefile;
testscript is built using the .sh implicit rule; and the test programs are
built using the rule in the last target entry, assuming that there is just one
source file for each test program. (The .c implicit rule doesn’t apply to these
programs, because they must link with the proper libraries in addition to their
respective .c files.)



Delayed References to a Shell Variable

The string $$$$ in the rule for test is, in fact, a pair of references to
make’s $ macro (each written as $$). make resolves each such reference into a
single $, and the command line is passed to the shell as:

set -x ; testscript > /tmp/test.$$

In this way, the variable reference is delayed from final expansion until it
reaches the shell, which interprets it as a reference to $$, the value of which
is the process number of the shell. This number is appended to the output
filename so that the results of each successive test are written to a unique
filename with a standard format. The set -x command forces the shell to display
the command on the terminal. This allows you to see the actual filename
containing the test results.


This makefile produces:

tutorial% make
cp testscript.sh testscript
chmod +x testscript
cc -sun4 -o test_1 test_1.c
cc -sun4 -o test_2 test_2.c
cc -sun4 -o test_3 test_3.c
testscript > /tmp/test.$$
+ testscript > /tmp/test.26500

A more flexible set of entries for testing a library looks like:

TESTSCRIPT= testscript
TESTPROGS= test_1 test_2 test_3

test: all $(TESTSCRIPT)
	set -x ; $(TESTSCRIPT) > /tmp/test.$$$$

$(TESTSCRIPT): $$@.sh $(TESTPROGS)

$(TESTPROGS): $$@.c $(LIBRARY)
	$(LINK.c) -o $@ $< $(LIBRARY) $(SLIBS)


In the case of a program, testing routines written in C may not be necessary;
leaving TESTPROGS undefined will mean the target entry for test programs is
omitted from the dependency scan. TESTSCRIPT depends only upon its
corresponding .sh file. If there are test programs that don’t depend on a
library (the LIBRARY macro is undefined) this method is still applicable; it is
the equivalent of the .c implicit rule. If there is a test program that depends
on the same libraries as the program does, you can replace references to the
LIBRARY macro with references to ULIBS:

$(TESTPROGS): $$@.c $(ULIBS)
	$(LINK.c) -o $@ $< $(ULIBS) $(SLIBS)



8.5. Maintaining Software Projects

make is especially useful when a software project consists of a system of
programs and libraries. By taking advantage of nested make commands, you can
use it to maintain object files, executables, and libraries in a whole
hierarchy of directories. You can use make in conjunction with SCCS, to assure
that sources are maintained in a controlled manner, and that programs built
from them are consistent. This means that you can provide other programmers
with duplicates of the directory hierarchy for simultaneous development and
testing if you wish (although there are tradeoffs to consider).

You can use make to build the entire project and install final copies of various
modules onto another filesystem for integration and distribution.

Organizing A Project for Ease of Maintenance

As mentioned earlier, one good way to organize a project is to segregate each
major piece into its own directory. A project broken out this way usually
resides within a single file-system or directory hierarchy. Header files could
reside in one subdirectory, libraries in another, and programs in still
another. Documentation, such as Reference Pages, may also be kept on hand in
another subdirectory. Suppose that a project is composed of one executable
program, one library that you supply, a set of header files for the library
routines, and some documentation, structured as shown.


Using include Makefiles 



The makefiles in each subdirectory can be borrowed from examples in earlier
sections, but something more is needed to manage the project as a whole. A
carefully structured makefile in the root directory, the root makefile for the
project, provides target entries for managing the project as a single entity.

As a project grows, the need for consistent, easy-to-use makefiles also grows. 
Macros and target names should have the same meanings no matter which 
makefile you are reading. Conditional macro definitions and compilation options 
for output variants should be consistent across the entire project. 

Where feasible, a template approach to writing makefiles makes sense. This
makes it easy for you to keep track of how the project gets built. All you have
to do to add a new type of module is to make a new directory for it, copy an
appropriate makefile into that directory, and make a few minor edits to change
macro values. (Of course, you also need to add the new module to the list of
things to build in the root makefile, but that comes later.)

Although a makefile should document exactly what it builds, it does not
necessarily have to contain an explanation of every step. After all, the idea
is to spend time working on the code, not the makefiles.

Conventions for macro names, such as those for the various source files in the 
above examples, should be instituted and observed throughout the project. 
Mnemonic macro names mean that although you may not remember the exact 
value of the macro, you’ll know the type of value it represents (and that’s usually 
more valuable when deciphering a makefile anyway). 

One method of simplifying makefiles, while providing a consistent compilation 
environment, is to use make’s 

include filename 

directive to read in the contents of a named makefile; if the named file is not
present, make checks for a file by that name in /usr/include/make.

For instance, there is no need to duplicate the pattern-matching rule for processing
troff sources in each makefile, when you can include its target entry,
as shown below.

SOURCES= main.c data.c

clean: $(PROGRAM) $(OBJECTS) $(LINTFILES)
include ../pm.rules.mk


Here, make reads in the contents of the ../pm.rules.mk file, shown here:

# pm.rules.mk
#
# Simple "include" makefile for pattern matching
# rules.

%.tr: %.ms
	troff -t -ms $< > $@

%.nr: %.ms
	nroff -ms $< > $@


While it may seem silly to propagate simple rules like these, the include
facility does allow you to define rules of any degree of complexity just once, and
maintain them in just one location.

Installing Finished Programs and Libraries

When a program is ready to be released for outside testing or general use, you
can use make to install it. Adding a new target and new macro definition to do
so is easy:

DESTDIR= /proto/project/bin

install: $(PROGRAM)
	-mkdir $(DESTDIR)
	cp $(PROGRAM) $(DESTDIR)


A similar target entry can be used for installing a library under the macro naming 
scheme used in this manual: 

DESTDIR= /proto/project/lib

install: $(LIBRARY)
	-mkdir $(DESTDIR)
	cp $(LIBRARY) $(DESTDIR)

A target entry for installing a list of header files might appear as:

DESTDIR= /proto/project/include

install: $(LIST)
	-mkdir $(DESTDIR)
	cp $(LIST) $(DESTDIR)


Finally, a list of Reference Manual Pages, which are typically distributed in
source form, is installed just like header files (these pages may comprise a subset of
the items in the doc subdirectory).


Building the Entire Project

From time to time it is necessary to take a snapshot of the sources, and the object
files that they produce. This can either be done as a checkpoint in the development
process, or as an intermediate or final build for release to users. Building
an entire project is simply a matter of invoking make successively in each subdirectory
to build and install each module.

Subsequent examples show how to incorporate these make commands in the
root makefile, which should also allow you to build debugging and profiling variants
of the project, clean the directories, and install completed modules. The following
example shows how to use nested make commands to build a simple project.


# Simple root makefile for a project.

TARGETS= all debug profile lint clean test install
SUBDIRS= bin include lib doc

$(TARGETS):
	$(MAKE) $(SUBDIRS) TARGET=$@

$(SUBDIRS): FORCE
	cd $@; $(MAKE) $(TARGET)

FORCE:
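
To see what the nested make commands amount to, the following sketch lists the
shell commands that the root makefile above would run for a given target. It is
written in Python purely as an illustration and is not part of make; the function
name nested_commands is hypothetical, and the subdirectory names are those of the
example.

```python
SUBDIRS = ["bin", "include", "lib", "doc"]   # SUBDIRS from the root makefile


def nested_commands(target, subdirs=SUBDIRS, make="make"):
    """Return the shell commands run in each subdirectory for `target`."""
    # `make install` first becomes `$(MAKE) $(SUBDIRS) TARGET=install`;
    # each subdirectory target then runs `cd $@; $(MAKE) $(TARGET)`.
    return [f"cd {d}; {make} {target}" for d in subdirs]


for command in nested_commands("install"):
    print(command)
```

Each line printed corresponds to one pass through the $(SUBDIRS) target entry,
with $@ replaced by the subdirectory name and $(TARGET) by the requested target.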
Maintaining Directory Hierarchies With Recursive Makefiles


If you extend your project hierarchy to include more layers: 



chances are that not only will the makefile in each intermediate directory have to
produce target files, but it will also have to invoke nested make commands for
subdirectories of its own. Files in the current directory can sometimes depend on
files in subdirectories, and their target entries need to depend on their counterparts
in the subdirectories.

This means that the nested make command for each subdirectory should run
before the command in the local directory does. One way to assure that the commands
run in the proper order is to make a separate entry for the nested part, and
another for the local part. If you add these new targets to the dependency list for
the original target, its action will encompass them both.

Targets that encompass equivalent actions in both the local directory and in subdirectories
are referred to as recursive targets.* A makefile with recursive targets
is referred to as a recursive makefile.

In the case of all, the nested dependency can be named all.nested; the
local dependency, all.local. Note that this example conditionally defines
the TARGET macro, rather than using $@, to pass the proper argument to the
make command in what is now the all.nested dependency.

all := TARGET= all
all: all.nested all.local
all.nested:
	$(MAKE) $(SUBDIRS) TARGET=$(TARGET)
$(SUBDIRS): FORCE
	cd $@; $(MAKE) $(TARGET)
all.local: $(PROGRAM)


* Strictly speaking, any target that calls make, with its name as an argument, is recursive. However, here
the term is reserved for the narrower case of targets that have both nested and local actions. Targets that only
have nested actions are referred to as “nested” targets.

Note that the “nested” target invokes make with the all target as an argument,
not all.nested. The nested make must also be recursive, unless it is at the
bottom of the hierarchy. Either way, it should be invoked with the same name as
that used in the parent directory. In the makefile for a leaf directory (one with no
subdirectories to descend into), you can simply comment out the rule for the
nested target, which will halt any further descent.

Recursive install Targets

This same principle can be extended to all of the generic targets. The install
target, however, is something of a special case. If the destination is a parallel
directory hierarchy (such as when you are installing completed source code), the
parent directories must be created before the destination subdirectories can be.
This often means that the make install target in the current directory (which
creates the destination directory if needed) must be performed before that in any
subdirectory can succeed. So, install.local must appear ahead of
install.nested in the dependency list for install.*
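
The ordering constraint can be demonstrated outside of make. The following Python
sketch is an illustration only: the helper create_in_order and the directory names
bin and bin/sun3 are assumptions chosen for the example. It tries to create a
parent directory and a child directory, in each order, inside a scratch directory.

```python
import os
import tempfile


def create_in_order(root, relative_dirs):
    """Try to mkdir each directory in turn; return the ones that succeeded."""
    created = []
    for rel in relative_dirs:
        try:
            os.mkdir(os.path.join(root, rel))
            created.append(rel)
        except FileNotFoundError:
            # The parent does not exist yet, so this mkdir fails, much as a
            # nested install would if it ran before the local one.
            pass
    return created


with tempfile.TemporaryDirectory() as root:
    # Local (parent) first, then nested (child): both succeed.
    print(create_in_order(root, ["bin", "bin/sun3"]))

with tempfile.TemporaryDirectory() as root:
    # Nested first: the child cannot be created, only the parent is.
    print(create_in_order(root, ["bin/sun3", "bin"]))
```

Creating the parent first corresponds to listing install.local ahead of
install.nested.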

This next example shows a recursive makefile in a directory with a C program 
and subdirectories. 

Figure 8-17 Recursive Makefile for Building a C Program and Subdirectories 


# @(#) sample.9.mk
#
# Recursive makefile for a C program and subdirectories.
# Also includes test and install targets.

SOURCES= main.c data.c
ULIBS= ../lib/libpkg.a
SLIBS= -lcurses -ltermlib
PROGRAM= program

SUBDIRS= sun2 sun3 sun4
TESTSCRIPT= testscript
TESTPROGS= test_1 test_2 test_3
DESTDIR= /proto/project/bin

CFLAGS= -O
CPPFLAGS=
LDFLAGS=
LINTFLAGS=

TARGETS= all debug profile lint clean test
TARGETS.nested= $(TARGETS:%=%.nested)
TARGETS.local= $(TARGETS:%=%.local)

OBJECTS= $(SOURCES:.c=.o)
LINTFILES= $(SOURCES:.c=.ln)

.KEEP_STATE:

debug := CFLAGS= -g
profile := CFLAGS= -pg -O
debug.local := CFLAGS= -g
profile.local := CFLAGS= -pg -O

# Recursive targets:

all := TARGET = all
debug := TARGET = debug
profile := TARGET = profile
lint := TARGET = lint
clean := TARGET = clean
test := TARGET = test
install := TARGET = install

$(TARGETS): $$@.nested $$@.local
install: $$@.local $$@.nested

# Nested targets:

$(TARGETS.nested) install.nested:
	$(MAKE) $(SUBDIRS) TARGET=$(TARGET)

$(SUBDIRS): FORCE
	cd $@ ; $(MAKE) $(TARGET)

# Local target entries:

all.local debug.local profile.local: $(PROGRAM)

$(PROGRAM): $(OBJECTS) $(ULIBS)
	$(LINK.c) -o $@ $(OBJECTS) $(ULIBS) $(SLIBS)

$(ULIBS): FORCE
	cd $(@D) ; $(MAKE) $(@F)

FORCE:

lint.local: $(LINTFILES)
	$(LINT.c) $(LINTFILES)

clean.local:
	rm -f $(PROGRAM) $(OBJECTS) $(LINTFILES) $(TESTSCRIPT) $(TESTPROGS)

test.local: all $(TESTSCRIPT)
	set -x ; $(TESTSCRIPT) > /tmp/test.$$$$

$(TESTSCRIPT): $(TESTSCRIPT).sh $(TESTPROGS)

$(TESTPROGS): $$@.c $(ULIBS)
	$(LINK.c) -o $@ $< $(ULIBS) $(SLIBS)

install.local: $(PROGRAM)
	-mkdir $(DESTDIR)
	-cp $(PROGRAM) $(DESTDIR)

* If the local target depends on files within a subdirectory, this may
force make to descend into that subdirectory twice during a make install run.
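
The macro references $(SOURCES:.c=.o) and $(TARGETS:%=%.nested) in the figure
rewrite each word of a macro's value. As a rough model only (written in Python,
with hypothetical helper names, and ignoring any edge cases that make itself
handles), the two forms behave like this:

```python
def suffix_replace(value, old, new):
    """Model $(MACRO:old=new): swap a trailing `old` for `new` in each word."""
    words = []
    for word in value.split():
        if old and word.endswith(old):
            word = word[:len(word) - len(old)] + new
        words.append(word)
    return " ".join(words)


def pattern_replace(value, pattern, replacement):
    """Model $(MACRO:pattern=replacement); % stands for the matched stem."""
    prefix, suffix = pattern.split("%", 1)
    words = []
    for word in value.split():
        if (word.startswith(prefix) and word.endswith(suffix)
                and len(word) >= len(prefix) + len(suffix)):
            stem = word[len(prefix):len(word) - len(suffix)]
            word = replacement.replace("%", stem)
        words.append(word)
    return " ".join(words)


print(suffix_replace("main.c data.c", ".c", ".o"))
print(pattern_replace("all debug profile", "%", "%.nested"))
```

The first call models the OBJECTS definition and the second models
TARGETS.nested, which is how each generic target acquires its .nested and
.local counterparts without their being listed by hand.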


Notice that you can still use make to build a local target, simply by appending
the .local suffix to the target name that you’re used to. The command

make all.local

does exactly what you’d expect. However, we recommend against making a
habit of this practice, especially where local targets rely on modules in nested
targets. If the files in the subdirectories are up to date, it doesn’t take very long
for make to check them. If they aren’t up to date, and you build the local target
without a full dependency check, there is a strong possibility that the target file
you produce will be inconsistent with those lower-level files, at least until it is
clean’ed and remade.


Maintaining A Large Library as a Hierarchy of Subsidiaries

When maintaining a very large library, it is sometimes easier to break it up into
smaller, subsidiary libraries, and use make to combine them into a complete
package. Although you cannot combine libraries directly with ar, you can
extract the member files from each subsidiary library, and then archive those files
in another step:


tutorial% ar xv libx.a
x - x1.o
x - x2.o
x - x3.o
tutorial% ar xv liby.a
x - y1.o
x - y2.o
tutorial% ar rv libz.a *.o
a - x1.o
a - x2.o
a - x3.o
a - y1.o
a - y2.o
ar: creating libz.a


A subsidiary library is maintained using a makefile in its own directory, along
with the (object) files it is built from. The makefile for the complete library typically
makes a symbolic link to each subsidiary archive, extracts their contents
into a temporary subdirectory, and archives the resulting files to form the complete
package.


Note: In general, use of shell filename
wildcards is considered to be bad
form in a makefile. If you do use them,
you need to take steps to ensure
that they exclude spurious files,
perhaps by isolating affected files in
a temporary subdirectory.


The next example updates the subsidiary libraries, creates a temporary directory
in which to extract the files, and extracts them. It uses the * (shell) wild card
within that temporary directory to generate the collated list of files. While 
filename substitutions are generally frowned upon, this use of the wild card is 
acceptable because the directory is created afresh whenever the target is built. 
This guarantees that it will contain only files extracted during the current make 
run. 

The example relies on a naming convention for directories. The name of the
directory is taken from the basename of the library it contains. For instance, if
libx.a is a subsidiary library, the directory that contains it is named libx. It
makes use of suffix replacements in dynamic-macro references to derive the
directory name for each specific subdirectory. (You can verify for yourself that this
is necessary.)

It uses a shell for loop to successively extract each library, and a shell command
substitution to collate the object files into proper sequence for linking
(using lorder and tsort) as it archives them into the package. Finally, it
removes the temporary directory and its contents.

# Simple makefile for collating a library from
# subsidiaries.

LIBRARY= libz.a
LIBS= libx.a liby.a

ARFLAGS=
CFLAGS= -O
CPPFLAGS=

.KEEP_STATE:
.PRECIOUS: libz.a

all: $(LIBRARY)

$(LIBRARY): $(LIBS)
	-rm -rf tmp
	-mkdir tmp
	set -x ; for i in $(LIBS) ; \
	  do ( cd tmp ; ar x ../$$i ) ; done
	( cd tmp ; rm -f __.SYMDEF ; ar cr ../$@ `lorder * | tsort` )
	-ranlib $@
	-rm -rf tmp $(LIBS)

$(LIBS): FORCE
	-cd $(@:.a=) ; $(MAKE) $@
	-ln -s $(@:.a=)/$@ $@

FORCE:



For the sake of clarity, this example omits support for alternate variants, as well
as the targets for clean, install, and test (lint does not apply since the
source files are in the subdirectories). This material is added in later examples.

The rm -f __.SYMDEF command embedded in the collating line prevents a
symbol table in a subsidiary (produced by running ranlib on that library) from
being archived in this library.

Since the nested make commands build the subsidiary libraries before the
current library is processed, it is a simple matter to extend this makefile to
account for libraries built from both subsidiaries and object files in the current
directory. You need only add the list of object files to the dependency list for the
library, and a command to copy them into the temporary subdirectory for collation
with object files extracted from subsidiary libraries.

# Simple makefile for collating a library from
# subsidiaries and local object files.

LIBRARY= libz.a
LIBS= libx.a liby.a
SOURCES= map.c calc.c draw.c
ULIBS= $(LIBRARY)

ARFLAGS=
CFLAGS= -O
CPPFLAGS=

OBJECTS= $(SOURCES:.c=.o)

.KEEP_STATE:
.PRECIOUS: libz.a

all: $(LIBRARY)

$(LIBRARY): $(LIBS) $(OBJECTS)
	-rm -rf tmp
	-mkdir tmp
	-cp $(OBJECTS) tmp
	set -x ; for i in $(LIBS) ; \
	  do ( cd tmp ; ar x ../$$i ) ; done
	( cd tmp ; rm -f __.SYMDEF ; ar cr ../$@ `lorder * | tsort` )
	-ranlib $@
	-rm -rf tmp $(LIBS)

$(LIBS): FORCE
	-cd $(@:.a=) ; $(MAKE) $@
	-ln -s $(@:.a=)/$@ $@

FORCE:


The next example includes support for debugging and profiling variants, along
with recursive targets for clean, lint, test, and install.

Figure 8-18 Makefile for a Hierarchy of Subsidiary Libraries with Variants

# @(#) sample.10.mk
#
# Makefile for collating a library from local object files and
# subsidiary libraries. Supports alternate variants, and maintains
# subdirectories recursively.

LIBRARY= libz.a
LIBS= libx.a liby.a
SOURCES= map.c calc.c draw.c
ULIBS= $(LIBRARY)
SLIBS= -lcurses -ltermlib
SUBDIRS= $(LIBS:.a=)
TESTSCRIPT= testscript
TESTPROGS= test_1 test_2 test_3
DESTDIR= /proto/project/lib

ARFLAGS=
CFLAGS= -O
CPPFLAGS=
LDFLAGS=
LINTFLAGS=

TARGETS= lint clean test
TARGETS.nested=$(TARGETS:%=%.nested)
TARGETS.local=$(TARGETS:%=%.local)

OBJECTS= $(SOURCES:.c=.o)
LINTFILES= $(SOURCES:.c=.ln)

.KEEP_STATE:
.PRECIOUS: libz.a

all profile debug: $(LIBRARY)

debug := CFLAGS= -g
profile := CFLAGS= -O -pg

debug := TARGET= debug
profile := TARGET= profile

$(LIBRARY): $(LIBS) $(OBJECTS)
	-rm -rf tmp
	-mkdir tmp
	-cp $(OBJECTS) tmp
	-set -x ; for i in $(LIBS) ; \
	  do ( cd tmp ; ar x ../$$i ) ; done
	( cd tmp ; rm -f __.SYMDEF ; ar cr ../$@ `lorder * | tsort` )
	-ranlib $@
	-rm -rf tmp $(LIBS)

$(LIBS): FORCE
	-cd $(@:.a=) ; $(MAKE) $(TARGET)
	-ln -s $(@:.a=)/$@ $@

FORCE:

# Recursive targets:

lint := TARGET = lint
clean := TARGET = clean
test := TARGET = test
install := TARGET = install

lint clean test: $$@.nested $$@.local
install: $$@.local $$@.nested

# Nested targets:

$(TARGETS.nested) install.nested:
	$(MAKE) $(SUBDIRS) TARGET=$(TARGET)

$(SUBDIRS): FORCE
	cd $@ ; $(MAKE) $(TARGET)

# Local target entries:

lint.local: $(LINTFILES)
	$(LINT.c) $(LINTFILES)

clean.local:
	rm -f $(LIBRARY) $(OBJECTS) $(LINTFILES) $(TESTSCRIPT) $(TESTPROGS)

test.local: all $(TESTSCRIPT)
	set -x ; $(TESTSCRIPT) > /tmp/test.$$$$

$(TESTSCRIPT): $(TESTSCRIPT).sh $(TESTPROGS)

$(TESTPROGS): $$@.c $(ULIBS)
	$(LINK.c) -o $@ $< $(ULIBS) $(SLIBS)

install.local: $(LIBRARY)
	-mkdir $(DESTDIR)
	-cp $(LIBRARY) $(DESTDIR)


Closing Remarks about make

make has evolved into a powerful and flexible tool for consistently processing
files that stand in a hierarchical relationship to one another. The methods and
examples shown in this manual are intended to provide you with an exposure to
the kinds of problems that lend themselves to solution with make. There is a
large body of folklore about make; strong and varied opinions about its “best”
use abound. This manual does not make the claim that any one approach or
example is necessarily the best available. Compromises between clarity and
functionality were made in many of the examples.

Also, there is considerable opinion both for and against makefiles that use macros
extensively. Some experts prefer to tailor makefiles for specific situations.
Others prefer that all makefiles look the same and work the same way.

This manual takes the latter approach. The examples are intended to be useful,
just as they are, in a wide variety of not-too-complicated settings. As procedures
become more complicated, so do the makefiles that implement them. The trick is
to know which approach will yield a reasonable makefile that works in a given
situation. The examples are intended to give you a flavor for common situations,
and some fairly straightforward methods to simplify them using make.

If a template approach is used in a project from the outset, chances are that custom
makefiles that evolve from the templates will be more familiar, and therefore
easier to understand, to integrate, to maintain, and more importantly, to re-use.
After all, the less time you spend tinkering with the makefiles, the more time you
have to develop your program or project.
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m4 — a Macro Processor 


m4 is a macro processor whose primary use has been as a front end for Ratfor in 
those cases where parameterless macros are not powerful enough. It has also 
been used for languages as disparate as C and COBOL. m4 is particularly suited 
for higher-level languages like FORTRAN, PL/I and C since macros are specified 
in a functional notation. 

m4 provides features seldom found even in much larger macro processors, 
including 

□ arguments 

□ condition testing 

□ arithmetic capabilities 

□ string and substring functions

□ file manipulation 

A macro processor is a useful way to enhance a programming language, to make
it more palatable or more readable, or to tailor it to a particular application. The
#define statement in C and the analogous define in Ratfor are examples of
the basic facility provided by any macro processor, that is, replacement of text by
other text.

The basic operation of m4 is to act as a filter between its input and its output. As
the input is read, each alphanumeric “token” (that is, string of letters and digits)
is checked. If it is the name of a macro, then the name is replaced by the text that
has been assigned to it (its defining text), and the resulting string is pushed back
onto the input to be rescanned. Macros may be called with arguments, in which
case the arguments are collected and substituted into the right places in the text
before it is rescanned.
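
The scan-and-rescan loop described above can be modelled in a few lines of
Python. This is a minimal sketch under stated assumptions, not m4's
implementation: it handles only macros without arguments, ignores quoting and
built-ins, and the names TOKEN and expand are invented for the illustration.

```python
import re

# An identifier-like token, or any other single character.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|.", re.DOTALL)


def expand(text, macros):
    """Replace whole-token macro names, rescanning each replacement."""
    pending = TOKEN.findall(text)[::-1]      # stack; the next token is last
    output = []
    while pending:
        token = pending.pop()
        if token in macros:
            # Push the defining text back onto the input to be rescanned.
            # (A macro that expands to its own name would loop forever here.)
            pending.extend(TOKEN.findall(macros[token])[::-1])
        else:
            output.append(token)
    return "".join(output)


print(expand("if (NNN > N)", {"N": "100"}))
print(expand("limit = M;", {"M": "N", "N": "100"}))
```

In the first call only the stand-alone token N is replaced, because NNN is a
different token. In the second, the defining text of M is itself rescanned, so
the N it contains is replaced in turn.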

m4 provides a collection of about twenty built-in macros which perform various
useful operations; in addition, the user can define new macros. Built-in macros
and user-defined macros work exactly the same way, except that some of the
built-in macros have side effects on the state of the process.

9.1. Using the m4 Command

The basic m4 command line looks like this:

m4 [filename ... ]

Each argument file is processed in order; if there are no arguments, or if an argument
is -, the standard input is read at that point. The processed text is written
to the standard output, which may be captured for subsequent processing using
redirection:

m4 [filename ... ] > outputfile


9.2. Defining Macros

The primary built-in function of m4 is define, which is used to define new
macros. The input

define(name, value)

defines the string name as value. All subsequent occurrences of name will be
replaced by value, unless name is redefined, or its definition is removed. Note
that name must be alphanumeric, and must begin with a letter; the underscore
character, _, is taken as a letter. The value argument is any text that contains
balanced parentheses; it may stretch over multiple lines.


Thus, as a typical example,

define(N, 100)

if (i > N)

defines N to be 100, and uses this “symbolic constant” in a later if statement.

The left parenthesis must immediately follow the word define, to signal that 
define has arguments. If a macro or built-in name is not followed immediately 
by ‘(’, it is assumed to have no arguments. This is the situation for N above; it is 
actually a macro with no arguments, and thus when it is used there need be no 
parenthesis following it. 

m4 divides its input into tokens, so a macro name is only recognized as such if it
appears surrounded by non-alphanumerics. For example, in

define(N, 100)

if (NNN > 100)


the variable NNN is absolutely unrelated to the defined macro N, even though it 
contains several N’s. 
Macros may be defined in terms of other macros. For example,

define(N, 100)
define(M, N)

defines both M and N to be 100.

What happens if N is redefined? Or, to say it another way, is M defined as N or as
100? In m4, the latter is true. M is translated to 100 as it is scanned, so changing
N does not change M.

This behavior arises because m4 expands macro names into their defining text 
immediately. Here, that means that when the string N is seen while the argu¬ 
ments of define are being collected, it is immediately replaced by 100; it’s 
just as if you had said 

define(M, 100) 

in the first place. 


If this isn’t what you really want, there are two alternatives. The first, which is
specific to this situation, is to interchange the order of the definitions:

define(M, N)
define(N, 100)

Now M is defined to be the string N, so when you ask for M later, you’ll always
get the value of N at that time (because the M will be replaced by N which will be
replaced in turn by its value).


9.3. Quoting and Comments

The more general solution is to delay the expansion of the arguments of define
by quoting them. Any text enclosed within the single-quote marks ` and ' is
not expanded immediately, but merely has the quotes stripped off. If you say

define(N, 100)
define(M, `N')

the quotes around the N are stripped off as the argument is being collected, but
they have served their purpose, and M is defined as the string N, rather than the
value of the N macro.


The general rule is that m4 always strips off one level of single quotes whenever
it evaluates something. This is true even outside of macros. If you want the
word define to appear in the output, you have to quote it in the input, as in

`define' = 1;

As another instance of the same thing, which is a bit more surprising, consider
redefining N:

define(N, 100)
define(N, 200)

Perhaps regrettably, the N in the second definition is evaluated as soon as it’s
seen; that is, it is replaced by 100, so it’s as if you had written

define(100, 200)

While this statement is ignored by m4, since you can only define macros with
names that start with an alphabetical character or underscore, it obviously doesn’t
have the effect you wanted. To redefine N, you must delay the evaluation by
quoting it:

define(N, 100)
define(`N', 200)
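
As a rough model of the rule that one level of quotes is stripped per evaluation,
the Python sketch below removes the outermost pair of quotes from each quoted
stretch and leaves nested quotes in place. It is an illustration only:
strip_one_level is a hypothetical helper, and the sketch ignores macro expansion
entirely.

```python
def strip_one_level(text, lq="`", rq="'"):
    """Remove the outermost quote pair of each quoted stretch of text."""
    output = []
    depth = 0
    for ch in text:
        if ch == lq:
            depth += 1
            if depth == 1:
                continue            # drop the outermost opening quote
        elif ch == rq and depth > 0:
            depth -= 1
            if depth == 0:
                continue            # drop the matching closing quote
        output.append(ch)
    return "".join(output)


print(strip_one_level("`define' = 1;"))
print(strip_one_level("``N''"))
```

The second call shows why doubly quoted text survives one evaluation still
quoted: only the outer pair is removed. The lq and rq parameters are there
only so that the model can be tried with other quote characters.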


If the ` and ' characters are not convenient for some reason, the quote and end-quote
characters can be changed with the built-in changequote function. For
instance:

changequote([, ])

makes the left and right brackets the new quote and end-quote characters. You can
restore the original characters with just changequote.

There are two additional built-ins related to define. undefine removes the
definition of some macro or built-in:


undefine(`N')

removes the definition of N. (Why are the quotes absolutely necessary?) Built-ins
can be removed with undefine, as in

undefine(`define')

but once you remove one, you can never get it back.

The built-in ifdef provides a way to determine if a macro is currently defined.
In particular, m4 pre-defines the name unix.

ifdef actually permits three arguments; if the name is undefined, the value of
ifdef is then the third argument, as in

ifdef(`unix', on SunOS, not on SunOS)

Don’t forget the quotes around the argument.

Comments in m4 are introduced by the # (sharp) character. All text from the #
to the end of the line is taken as a comment and otherwise ignored.

9.4. Macros with Arguments

So far we have discussed the simplest form of macro processing: replacing one
string by another (fixed) string. User-defined macros may also have arguments,
so different invocations can have different results. Within the replacement text
for a macro (the second argument of its define) any occurrence of $n is
replaced by the nth argument when the macro is actually used. Thus, the macro
bump, defined as

define(bump, $1 = $1 + 1)

generates code to increment its argument by 1:

bump(x)

evaluates to

x = x + 1

A macro can have as many arguments as you want, but only the first nine are
accessible, through $1 to $9. The macro name itself is $0, although that is less
commonly used. Arguments that are not supplied are replaced by null strings, so
we can define a macro cat which simply concatenates its arguments, like this:

define(cat, $1$2$3$4$5$6$7$8$9)

Thus

cat(x, y, z)

is equivalent to

xyz

$4 through $9 are null, since no corresponding arguments were provided.

Leading unquoted spaces, tabs, or newlines that occur during argument
collection are discarded. All other white space is retained. Thus

define(a, b c)

defines a to be ‘b c’.

Arguments are separated by commas, but commas can be nested inside
parentheses. That is, in

define(a, (b,c))

there are only two arguments; the second is literally (b,c). And of course a bare
comma or parenthesis can be inserted by quoting it.

9.5. Arithmetic Built-ins

m4 provides two built-in functions for doing arithmetic on integers (only). The
simplest is incr, which increments its numeric argument by 1. Thus to handle
the common programming situation where you want a variable to be defined as
“one more than N”, write

define(N, 100)
define(N1, `incr(N)')

which defines N1 as one more than the current value of N.

The more general mechanism for arithmetic is a built-in called eval, which is
capable of arbitrary arithmetic on integers. eval provides the operators shown in
the table below, in decreasing order of precedence.

Table 9-1 Operators for the eval Built-in in m4

Operator              Meaning
unary + and -         unary plus and minus
** or ^               exponentiation
* / %                 multiply, divide, and modulus
+ -                   binary add and subtract
== != < <= > >=       equal, not equal, less than, less than or equal,
                      greater than, greater than or equal
!                     logical not
& or &&               logical and
| or ||               logical or


Parentheses may be used to group operations where needed. All the operands of
an expression given to eval must ultimately be numeric. The numeric value of
a true relation (like 1>0) is 1, and false is 0. The precision in eval is 32 bits.

As a simple example, suppose we want M to be 2**N+1. Then

define(N, 3)
define(M, `eval(2**N+1)')


As a matter of principle, it is advisable to quote the defining text for a macro 
unless it is very simple indeed (say, just a number); it usually gives the result you 
want, and is a good habit to get into. 

9.6. File Manipulation

You can include a new file in the input at any time by the built-in function
include:

include(filename)

inserts the contents of filename in place of the include command. The contents
of the file are often a set of definitions. The value of include (that is, its
replacement text) is the contents of the file; this can be captured in definitions,
etc.

It is a fatal error if the file named in include cannot be accessed. To get some
control over this, the alternate form sinclude can be used; sinclude
(“silent include”) says nothing and continues if it can’t access the file.

It is also possible to divert the output of m4 to temporary files during processing,
and output the collected material upon command. m4 maintains nine of these
diversions, numbered 1 through 9. If you say

divert(n)

all subsequent output is put onto the end of a temporary file referred to as n.
Diverting to this file is stopped by another divert command; in particular,
divert or divert(0) resumes the normal output process.

Diverted text is normally output all at once at the end of processing, with the
diversions output in numeric order. It is possible, however, to bring back diversions
at any time, that is, to append them to the current diversion.

undivert

brings back all diversions in numeric order, and undivert with arguments
brings back the selected diversions in the order given. The act of undiverting discards
the diverted stuff, as does diverting into a diversion whose number is not
between 0 and 9 inclusive.

The value of undivert is not the diverted text. Furthermore, the diverted
material is not rescanned for macros.

The built-in divnum returns the number of the currently active diversion. This
is zero during normal processing.

9.7. Running SunOS Commands

You can run any SunOS command using the syscmd built-in. For example,

syscmd(date)

runs the date command. Normally syscmd would be used to create a file for a
subsequent include.

To facilitate making unique file names, the built-in maketemp is provided, with
specifications identical to the system function mktemp: a string of XXXXX in the
argument is replaced by the process ID (pid) of the current process.

9.8. Conditionals

There is a built-in called ifelse which enables you to perform arbitrary conditional
testing. In its simplest form,

ifelse(a, b, c, d) 

compares the two strings a and b. If these are identical, ifelse returns the
string c; otherwise it returns d. Thus we might define a macro called compare
which compares two strings and returns “yes” or “no” according to whether
they are the same or different.

define(compare, `ifelse($1, $2, yes, no)')

Note the quotes, which prevent too-early evaluation of ifelse.

If the fourth argument is missing, it is treated as empty. 



Revision A of 9 May 1988 



200 Programming Utilities and Libraries 


ifelse can actually have any number of arguments, and thus provides a limited
form of multi-way decision capability. In the input 

ifelse(a, b, c, d, e, f, g) 

if the string a matches the string b, the result is c. Otherwise, if d is the same as 
e, the result is f. Otherwise the result is g. If the final argument is omitted, the 
result is null, so 

ifelse(a, b, c) 

is c if a matches b, and null otherwise. 
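
The argument handling described above can be summarized in a short Python model.
It is a sketch only: the function is a stand-in for the built-in, it compares
plain strings, and it covers only the argument counts discussed in the text (how
other counts are treated is not modelled).

```python
def ifelse(*args):
    """Model of multi-way ifelse: compare pairs, return the matching result."""
    remaining = list(args)
    while len(remaining) >= 3:
        a, b, result = remaining[:3]
        if a == b:
            return result
        remaining = remaining[3:]       # move on to the next comparison
    # One leftover argument is the final "otherwise" value; none means null.
    return remaining[0] if len(remaining) == 1 else ""


print(ifelse("a", "a", "yes", "no"))
print(ifelse("a", "b", "c", "d", "d", "f", "g"))
print(repr(ifelse("a", "b", "c")))
```

The three calls correspond to the four-argument form, the seven-argument
multi-way form, and the three-argument form whose result is null when the
strings differ.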

9.9. String Manipulation

The built-in len returns the length of the string that makes up its argument.
Thus

len(abcdef)

is 6, and len((a,b)) is 5.

The built-in substr can be used to produce substrings of strings.
substr(s, i, n) returns the substring of s that starts at the ith position
(origin zero), and is n characters long. If n is omitted, the rest of the string is
returned, so

substr(`now is the time', 1)
evaluates to 

ow is the time 

If either i or n is out of range, various sensible things happen. 

index(s1, s2) returns the index (position) in s1 where the string s2 occurs,
or -1 if it doesn’t occur. As with substr, the origin for strings is 0.

The built-in translit performs character transliteration. 

translit(s, f, t) 

modifies s by replacing any character found in f by the corresponding character
in t. That is,

translit(s, aeiou, 12345) 

replaces the vowels by the corresponding digits. If t is shorter than f, characters 
which don’t have an entry in t are deleted; as a limiting case, if t is not present 
at all, characters in f are deleted from s. So 

translit(s, aeiou) 

deletes vowels from s. 
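
The replace-or-delete rule can be sketched in a few lines of C. This is an
illustration of the behavior described above, not m4's implementation;
translit_sketch is a hypothetical name, and passing an empty string for t stands
in for "t is not present".

```c
#include <string.h>

/* Transliterate s into out: a character found in from is replaced by the
 * character at the same position in to, or deleted if to is too short.
 * out must have room for strlen(s) + 1 bytes. */
void translit_sketch(const char *s, const char *from, const char *to, char *out)
{
    size_t tlen = strlen(to);

    for (; *s != '\0'; s++) {
        const char *p = strchr(from, *s);

        if (p == NULL)
            *out++ = *s;                 /* not in from: copy unchanged */
        else if ((size_t)(p - from) < tlen)
            *out++ = to[p - from];       /* corresponding character in to */
        /* otherwise there is no entry in to, so the character is dropped */
    }
    *out = '\0';
}
```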

There is also a built-in called dnl which deletes all characters that follow it up
to and including the next newline; it is useful mainly for throwing away empty
lines that otherwise tend to clutter up m4 output. For example, if you say


define(N, 100)
define(M, 200)
define(L, 300)


the newline at the end of each line is not part of the definition, so it is copied into 
the output, where it may not be wanted. If you add dnl to each of these lines, 
the newlines will disappear. 


Another way to achieve this* is:


divert(-1)
define(...)
divert


9.10. Printing 




The built-in errprint writes its arguments to the standard error file. Thus you 
can say 

errprint(`fatal error')

dumpdef is a debugging aid which dumps the current definitions of defined 
terms. If there are no arguments, you get everything; otherwise you get the ones 
you name as arguments. Don’t forget to quote the names! 


9.11. Summary of Built-in m4 Macros

Table 9-2 Summary of Built-in m4 Macros

Built In                            Description
----------------------------------  ------------------------------------------
changequote(L, R)                   Change left quote to L, right quote to R
define(name, replacement)           Define name as replacement
divert(number)                      Divert output to stream number
divnum                              Return number of currently active
                                    diversions
dnl                                 Delete up to and including newline
dumpdef(`name', ...)                Dump specified definitions
errprint(s, s, ...)                 Write arguments s to standard error
eval(numeric expression)            Evaluate numeric expression
ifdef(`name', true string,          Return true string if name is defined,
      false string)                 false string if name is not defined
ifelse(a, b, c, d)                  If a and b are equal, return c, else
                                    return d
include(file)                       Include contents of file
incr(number)                        Increment number by 1
index(s1, s2)                       Return position in s1 where s2 occurs,
                                    or -1 if no occurrence
len(string)                         Return length of string
maketemp(...XXXXX...)               Make a temporary file
sinclude(file)                      Include contents of file; ignore and
                                    continue if file not found
substr(string, position, number)    Return substring of string starting at
                                    position and number characters long
syscmd(command)                     Run command in the system
translit(string, from, to)          Transliterate characters in string from
                                    the set specified by from to the set
                                    specified by to
undefine(`name')                    Remove name from the list of definitions
undivert(number, number, ...)       Append diversion number to the current
                                    diversion

* Thanks to J. E. Weythman.

lex — a Lexical Analyzer Generator


lex is a program generator designed for lexical processing of character input
streams. lex accepts a high-level, problem-oriented specification for character
string matching, and produces a program in a general-purpose language which
recognizes regular expressions. The regular expressions are specified by the
programmer in the source specifications given to lex. The code written by lex
recognizes these expressions in an input stream and partitions the input stream
into strings matching the expressions. At the boundaries between strings,
program sections provided by the programmer are executed. The lex source file
associates the regular expressions and the program fragments. As each
expression appears in the input to the program written by lex, the
corresponding fragment is executed.

The programmer supplies the additional code beyond expression matching
needed to complete the task, possibly including code written by other
generators. The program that recognizes the expressions is generated in the
general-purpose programming language employed for the programmer’s program
fragments. Thus, a high-level expression language is provided to write the string
expressions to be matched while the programmer’s freedom to write actions is
unimpaired. This avoids forcing the programmer who wishes to use a string
manipulation language for input analysis to write processing programs in the
same and often inappropriate string handling language.

lex source is a table of regular expressions and corresponding program
fragments. The table is translated to a program which reads an input stream,
copying it to an output stream and partitioning the input into strings which
match the given expressions. As each such string is recognized the
corresponding program fragment is executed. The recognition of the expressions
is performed by a deterministic finite automaton generated by lex. The program
fragments written by the programmer are executed in the order in which the
corresponding regular expressions occur in the input stream.

The lexical analysis programs written with lex accept ambiguous specifications
and choose the longest match possible at each input point. If necessary,
substantial lookahead is performed on the input, but the input stream is then
backed up to the end of the current partition, so that the programmer has
general freedom to manipulate it.

lex is designed to simplify interfacing with yacc, which is described in the 
next chapter. 

lex is not a complete language, but rather a generator representing a new
language feature which can be added to different programming languages, called
‘host languages.’ Just as general-purpose languages can produce code to run on
different computer hardware, lex can write code in different host languages.

The host language is used for the output code generated by lex and also for the
program fragments added by the programmer. Compatible run-time libraries for
the different host languages are also provided. This makes lex adaptable to
different environments and different programmers. Each application may be
directed to the combination of hardware and host language appropriate to the
task, the programmer’s background, and the properties of local implementations.

lex turns the programmer’s expressions and actions (called source in this
document) into the host general-purpose language; the generated program is
named yylex. The yylex program recognizes expressions in a stream (called
input in this document) and performs the specified actions for each expression
as it is detected; see Figure 10-1 below.

Figure 10-1 An overview of lex



For a trivial example, consider a program to delete from the input all blanks or
tabs at the ends of lines.

%%
[ \t]+$   ;

is all that is required. The program contains a %% delimiter to mark the
beginning of the rules, and one rule. This rule contains a regular expression
which matches one or more instances of the characters blank or tab (written \t
for visibility, in accordance with the C convention) just prior to the end of a
line. The brackets indicate the character class made of blank and tab; the +
indicates ‘one or more ...’; and the $ indicates ‘end-of-line’. No action is
specified, so the program generated by lex (yylex) ignores these characters.
Everything else is copied to the output stream. To change any remaining string
of blanks or tabs to a single blank, add another rule:


%%
[ \t]+$   ;
[ \t]+    printf(" ");


The finite automaton generated for this source scans for both rules at once, 
observing at the termination of the string of blanks or tabs whether or not there is 
a newline character, and executing the desired rule action. The first rule matches 
all strings of blanks or tabs at the ends of lines, and the second rule all remaining 
strings of blanks or tabs. 
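
To make the combined effect of the two rules concrete, the following C function
performs the same transformation by hand on a string held in memory. It is a
sketch for illustration only, not the code lex generates; the name
squeeze_blanks is an assumption.

```c
/* Apply the effect of the two rules to in, writing the result to out.
 * out must have room for as many bytes as in, including the terminator. */
void squeeze_blanks(const char *in, char *out)
{
    while (*in != '\0') {
        if (*in == ' ' || *in == '\t') {
            /* Consume the whole run of blanks and tabs: [ \t]+ */
            while (*in == ' ' || *in == '\t')
                in++;
            /* Run followed by newline: first rule, emit nothing.
             * Any other run: second rule, emit a single blank. */
            if (*in != '\n')
                *out++ = ' ';
        } else {
            *out++ = *in++;     /* default action: copy to the output */
        }
    }
    *out = '\0';
}
```

Note that the decision between the two rules is only made once the end of the
run has been seen, which is the lookahead the paragraph above describes.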




lex can be used alone for simple transformations, or for analysis and statistics
gathering on a lexical level. lex can also be used with a parser generator to
perform the lexical analysis phase; it is particularly easy to interface lex and
yacc. lex programs recognize only regular expressions; yacc writes parsers that
accept a large class of context-free grammars, but require a lower-level analyzer
to recognize input tokens. Thus, a combination of lex and yacc is often
appropriate. When used as a preprocessor for a later parser generator, lex is
used to partition the input stream, and the parser generator assigns structure to
the resulting pieces. The flow of control in such a case (which might be the first
half of a compiler, for example) is shown in Figure 10-2. Additional programs,
written by other generators or by hand, can be added easily to programs written
by lex.


Figure 10-2 lex with yacc 



yacc programmers will realize that the name yylex is what yacc expects its 
lexical analyzer to be named, so that the use of this name by lex simplifies 
interfacing. 

lex generates a deterministic finite automaton from the regular expressions in
the source. The automaton is interpreted, rather than compiled, in order to save
space. The result is still a fast analyzer. In particular, the time taken by a lex
program to recognize and partition an input stream is proportional to the length
of the input. The number of lex rules or the complexity of the rules is not
important in determining speed, unless rules which include forward context
require a significant amount of rescanning. What does increase with the number
and complexity of rules is the size of the finite automaton, and therefore the size
of the program generated by lex.
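
The idea of an interpreted automaton can be sketched with a small transition
table. The example below is hand-written for the single expression
[a-z][a-z0-9]* and is not the table format lex actually emits; every name in it
is hypothetical. It does show why the running time depends on the input length
rather than on the number of rules: each input character costs one table lookup.

```c
#include <stddef.h>

enum { START, IN_ID, DEAD, NSTATES };      /* automaton states */
enum { LETTER, DIGIT, OTHER, NCLASSES };   /* input character classes */

/* next_state[state][class]: one lookup per input character */
static const int next_state[NSTATES][NCLASSES] = {
    /* START */ { IN_ID, DEAD,  DEAD },
    /* IN_ID */ { IN_ID, IN_ID, DEAD },
    /* DEAD  */ { DEAD,  DEAD,  DEAD },
};

static int char_class(int c)
{
    if (c >= 'a' && c <= 'z') return LETTER;
    if (c >= '0' && c <= '9') return DIGIT;
    return OTHER;
}

/* Return the length of the longest prefix of in that matches
 * [a-z][a-z0-9]*, or 0 if no prefix matches. */
size_t longest_identifier(const char *in)
{
    int state = START;
    size_t i, last_accept = 0;

    for (i = 0; in[i] != '\0'; i++) {
        state = next_state[state][char_class((unsigned char)in[i])];
        if (state == DEAD)
            break;                  /* no longer match is possible */
        if (state == IN_ID)
            last_accept = i + 1;    /* remember the longest match so far */
    }
    return last_accept;
}
```

Adding more rules would enlarge the table (more states), but the loop would
still do a constant amount of work per character.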

In the program written by lex, the programmer’s fragments (representing the 
actions to be performed as each regular expression is found) are gathered as cases 
of a switch. The automaton interpreter directs the control flow. Opportunity is 
provided for the programmer to insert either declarations or additional statements 
in the routine containing the actions, or to add subroutines outside this action 
routine. 

lex is not limited to source which can be interpreted on the basis of
one-character lookahead. For example, if there are two rules, one looking for ab
and another for abcdefg, and the input stream is abcdefh, lex recognizes ab
and leaves the input pointer just before "cd...". Such backup is more costly than
processing simpler languages.

10.1. lex Source

The general format of lex source is:

{definitions}
%%
{rules}
%%
{programmer subroutines}

where the definitions and the programmer subroutines are often omitted. The
second %% is optional, but the first is required to mark the beginning of the rules.
The absolute minimum lex program is thus

%%

(no definitions, no rules) which translates into a program which copies the input
to the output unchanged.

In the outline of lex programs shown above, the rules represent the
programmer’s control decisions; they are a table, in which the left column
contains regular expressions (see section 10.2) and the right column contains
actions, program fragments to be executed when the expressions are recognized.
Thus an individual rule might be

integer   printf("found keyword INT");

to look for the string integer in the input stream and print the message ‘found
keyword INT’ whenever it appears. In this example the host procedural language
is C and the C library function printf() is used to print the string. The end of
the expression is indicated by the first blank or tab character. If the action is
merely a single C expression, it can just be given on the right side of the line; if it
is compound, or takes more than a line, it should be enclosed in braces. As a
slightly more useful example, suppose it is desired to change a number of words
from British to American spelling. lex rules such as

colour      printf("color");
mechanise   printf("mechanize");
petrol      printf("gas");

would be a start. These rules are not quite enough, since the word petroleum
would become gaseum; a way of dealing with this is described later.

10.2. lex Regular Expressions

The definitions of regular expressions are very similar to those in the editors
ex(1) and vi(1). A regular expression specifies a set of strings to be matched. It
contains text characters (which match the corresponding characters in the strings
being compared) and operator characters (which specify repetitions, choices, and
other features). The letters of the alphabet and the digits are always text
characters; thus the regular expression

integer

matches the string integer wherever it appears and the expression

a57D

looks for the string a57D.

Operators

The operator characters are

" \ [ ] ^ - ? . * + | ( ) $ / { } % < >

and if they are to be used as text characters, an escape must be used. The
quotation mark operator (") indicates that whatever is contained between a pair
of quotes is to be taken as text characters. Thus

xyz"++" 

matches the string xyz++ when it appears. Note that a part of a string may be
quoted. It is harmless but unnecessary to quote an ordinary text character; the
expression

"xyz++" 

is the same as the one above. Thus by quoting every non-alphanumeric character 
being used as a text character, the programmer can avoid remembering the list 
above of current operator characters, and is safe should further extensions to lex 
lengthen the list. 

An operator character may also be turned into a text character by preceding it 
with \ as in 

xyz\+\+ 

which is another, less readable, equivalent of the above expressions. Another use
of the quoting mechanism is to get a blank into an expression; normally, as
explained above, blanks or tabs end a rule. Any blank character not contained
within [ ] (see below) must be quoted. Several normal C escapes with \ are
recognized: \n is newline, \t is tab, and \b is backspace. To enter \ itself, use \\.
Since newline is illegal in an expression, \n must be used; it is not required to
escape tab and backspace. Every character but blank, tab, newline and the list
above is always a text character.

Character Classes

Classes of characters can be specified using the operator pair [ ]. The
construction [abc] matches a single character, which may be a, b, or c. Within
square brackets, most operator meanings are ignored. Only three characters are
special: \, -, and ^. The - character indicates ranges. For example,

[a-z0-9<>_]

indicates the character class containing all the lower case letters, the digits, the
angle brackets, and underline. Ranges may be given in either order. Using -
between any pair of characters which are not both upper case letters, both lower
case letters, or both digits is implementation-dependent and generates a warning
message. For example, [0-z] in ASCII is many more characters than it is in
EBCDIC. If it is desired to include the character - in a character class, it should
be first or last, thus:

[-+0-9]

matches all the digits and the two signs.

In character classes, the ^ operator must appear as the first character after the left
bracket; it indicates that the resulting string is to be complemented with respect
to the system’s character set. Thus

[^abc]

matches all characters except a, b, or c, including all special or control
characters; and

[^a-zA-Z]

is any character which is not a letter. The \ character provides the usual escapes
within character class brackets.

Arbitrary Character

To match almost any character, the operator character

.

(period) is the class of all characters except newline. Escaping into octal is
possible although non-portable:

[\40-\176]

matches all printable characters in the ASCII character set, from octal 40 (blank)
to octal 176 (tilde).
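
The character-class rules (ranges with -, a literal - when first or last, and
complementation with a leading ^) can be sketched as a membership test in C.
This is an illustration only: in_class is a hypothetical helper, it takes the
text between the brackets, and it does not handle \ escapes.

```c
#include <string.h>

/* Return 1 if c belongs to the class whose body is cls, else 0.
 * cls is the text between the brackets, e.g. "a-z0-9<>_" or "^abc". */
int in_class(const char *cls, char c)
{
    int negate = 0, found = 0;
    size_t i, n;

    if (cls[0] == '^') {        /* ^ is special only as the first character */
        negate = 1;
        cls++;
    }
    n = strlen(cls);
    for (i = 0; i < n; i++) {
        if (i + 2 < n && cls[i + 1] == '-') {
            /* A range; the endpoints may be given in either order. */
            unsigned char lo = (unsigned char)cls[i];
            unsigned char hi = (unsigned char)cls[i + 2];
            unsigned char uc = (unsigned char)c;

            if (lo > hi) {
                unsigned char t = lo;
                lo = hi;
                hi = t;
            }
            if (uc >= lo && uc <= hi)
                found = 1;
            i += 2;             /* skip the '-' and the upper endpoint */
        } else if (cls[i] == c) {
            found = 1;          /* single character, including a first or last '-' */
        }
    }
    return negate ? !found : found;
}
```

For instance, in_class("-+0-9", '-') is true because the leading - is not
followed by another -, so it is taken as a text character.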

Optional Expressions

The operator ? indicates an optional element of an expression. Thus

ab?c

matches either ac or abc.

Repeated Expressions

Repetitions of classes are indicated by the operators * and +.

a*

is any number of consecutive a characters, including zero; while

a+

is one or more instances of a. For example,

[a-z]+

is all strings of lower case letters. And

[A-Za-z][A-Za-z0-9]*

indicates all alphanumeric strings with a leading alphabetic character. This is a
typical expression for recognizing identifiers in computer languages.

Alternation and Grouping

The operator | indicates alternation:

(ab|cd)

matches either ab or cd. Note that parentheses are used for grouping, although
they are not necessary on the outside level;

ab|cd

would have sufficed. Parentheses can be used for more complex expressions:

(ab|cd+)?(ef)*

matches such strings as abefef, efefef, cdef, or cddd; but not abc,
abcd, or abcdef.

Context Sensitivity

lex recognizes a small amount of surrounding context. The two simplest
operators for this are ^ and $. If the first character of an expression is ^, the
expression is only matched at the beginning of a line. This can never conflict
with the other meaning of ^, complementation of character classes, since that
only applies within the [ ] operators. If the very last character is $, the
expression is only matched at the end of a line (when immediately followed by
newline).
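
To see how the optional, repetition, and grouping operators of this section
combine, the grouped expression (ab|cd+)?(ef)* from the alternation example can
be hand-coded as a whole-string test in C. This is a sketch for checking one's
reading of the operators, not how lex matches; matches_example is a
hypothetical name.

```c
/* Return 1 if the whole string s matches (ab|cd+)?(ef)*, else 0. */
int matches_example(const char *s)
{
    /* (ab|cd+)? : optional prefix, either "ab" or "c" followed by d's */
    if (s[0] == 'a' && s[1] == 'b') {
        s += 2;
    } else if (s[0] == 'c' && s[1] == 'd') {
        s += 2;
        while (*s == 'd')
            s++;
    }
    /* (ef)* : zero or more copies of "ef" */
    while (s[0] == 'e' && s[1] == 'f')
        s += 2;
    /* The match succeeds only if nothing is left over. */
    return *s == '\0';
}
```

The function accepts abefef, efefef, cdef, and cddd, and rejects abc, abcd, and
abcdef, in agreement with the list given for that expression.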

The latter operator is a special case of the / operator character, which indicates
trailing context. The expression

ab/cd

matches the string ab, but only if it is followed by cd. Thus

ab$

is the same as

ab/\n

Left context is handled in lex by start conditions as explained in section 10.9,
Left Context-Sensitivity. If a rule is only to be executed when the lex
automaton interpreter is in start condition x, the rule should be prefixed by

<x>

using the angle bracket operator characters. If we considered ‘being at the
beginning of a line’ to be start condition ONE, then the ^ operator would be
equivalent to

<ONE>

Start conditions are explained more fully below.

Repetitions and Definitions

The operators { } specify either repetitions (if they enclose numbers) or
definition expansion (if they enclose a name). For example

{digit}

looks for a predefined string named digit and inserts it at that point in the
expression. The definitions are given in the first part of the lex input, before the
rules. In contrast,

a{1,5}

looks for 1 to 5 occurrences of a.

Finally, initial % is special, being the separator for lex source segments.

10.3. lex Actions

When an expression written as above is matched, lex executes the
corresponding action. This section describes some features of lex which aid in
writing actions. Note that there is a default action, which consists of copying
the input to the output. This is performed on all strings not otherwise matched.
Thus the lex programmer who wishes to absorb the entire input, without
producing any output, must provide rules to match everything. When lex is
being used with yacc, this is the normal situation. One may consider that
actions are what is done instead of copying the input to the output; thus, in
general, a rule which merely copies can be omitted. Also, a character
combination which is omitted from the rules and which appears as input is
likely to be printed on the output, thus calling attention to the gap in the rules.

One of the simplest things that can be done is to ignore the input. Specifying a
C null statement, ; as an action does this. A frequent rule is

[ \t\n]   ;

which ignores the three spacing characters (blank, tab, and newline).

Another easy way to avoid writing actions is the action character |, which
indicates that the action to be used for this rule is the action given for the next
rule. The previous example could also have been written

" "    |
"\t"   |
"\n"   ;

with the same result. The quotes around \n and \t are not required.

Actual Text that Matched

In more complex actions, the programmer often wants to know the actual text
that matched some expression like [a-z]+. lex leaves this text in an external
character array named yytext.

Thus, to print the name found, a rule like

[a-z]+   printf("%s", yytext);

prints the string in yytext. The C function printf accepts a format argument
and data to be printed; in this case, the format is ‘print string’ (% indicating data
conversion, and s indicating string type), and the data are the characters in
yytext. So this just places the matched string on the output. This action is so
common that it may be written as ECHO:

[a-z]+   ECHO;

is the same as the above. Since the default action is just to print the characters
found, one might ask why give a rule, like this one, which merely specifies the
default action? Such rules are often required to avoid matching some other rule
which is not desired. For example, if there is a rule which matches read it
normally matches the instances of read contained in bread or readjust; to
avoid this, a rule of the form [a-z]+ is needed. This is explained further
below.

Length of Matched Text

Sometimes it is more convenient to know the end of what has been found; hence
lex also provides a count yyleng of the number of characters matched. To
count both the number of words and the number of characters in words in the
input, the programmer might write

[a-zA-Z]+   {words++; chars += yyleng;}

which accumulates in chars the number of characters in the words recognized.
The last character in the string matched can be accessed by

yytext[yyleng-1].
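
For readers who want to check what the word-counting rule computes, the same
bookkeeping can be written by hand in C. This is a sketch, not lex output:
count_words is a hypothetical function, and the local variable yyleng merely
plays the role of the lex variable of the same name.

```c
#include <ctype.h>

/* Count words matching [a-zA-Z]+ in the string in, and the total number
 * of characters in those words. Everything else is skipped. */
void count_words(const char *in, int *words, int *chars)
{
    *words = 0;
    *chars = 0;
    while (*in != '\0') {
        if (isalpha((unsigned char)*in)) {
            int yyleng = 0;     /* length of the current match */

            while (isalpha((unsigned char)in[yyleng]))
                yyleng++;
            (*words)++;
            *chars += yyleng;
            in += yyleng;       /* continue after the matched text */
        } else {
            in++;               /* not part of a word */
        }
    }
}
```

The sketch relies on isalpha accepting exactly the letters a-z and A-Z, which
holds in the default "C" locale.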

yymore and yyless

Occasionally, a lex action may decide that a rule has not recognized the correct
span of characters. Two routines are provided to aid with this situation. First,
yymore() can be called to indicate that the next input expression recognized is
to be tacked on to the end of this input. Normally, the next input string would
overwrite the current entry in yytext. Second, yyless(n) may be called to
indicate that not all the characters matched by the currently successful expression
are wanted right now. The argument n indicates the number of characters to be
retained in yytext. Further characters previously matched are returned to the
input. This provides the same sort of lookahead offered by the / operator, but in
a different form.

Example: Consider a language which defines a string as a set of characters
between quotation (") marks, and provides that to include a " in a string it must
be preceded by a \. The regular expression which matches that is somewhat
confusing, so that it might be preferable to write:

\"[^"]*   {
          if (yytext[yyleng-1] == '\\')
                  yymore();
          else
                  ... normal programmer processing
          }

which, when faced with a string such as "abc\"def" first matches the five
characters "abc\ ; then the call to yymore() tacks the next part of the string,
"def , onto the end. Note that the final quote terminating the string should be
picked up in the code labeled ‘normal processing’.
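
The effect of the repeated yymore() calls can be imitated in plain C by growing
one match across escaped quotes. The function below is a sketch under that
reading, not code produced by lex; scan_string is a hypothetical name, and like
the rule above it treats any quote preceded by \ as escaped.

```c
#include <stddef.h>

/* If in begins with a quoted string, return its length including both
 * quotation marks; return 0 if it does not, or if the string is unterminated. */
size_t scan_string(const char *in)
{
    size_t len;

    if (in[0] != '"')
        return 0;
    len = 1;
    for (;;) {
        /* Equivalent of [^"]* : run up to the next quotation mark. */
        while (in[len] != '\0' && in[len] != '"')
            len++;
        if (in[len] == '\0')
            return 0;               /* no closing quote */
        if (in[len - 1] == '\\') {
            len++;                  /* escaped quote: keep accumulating, */
            continue;               /* as yymore() does in the rule      */
        }
        return len + 1;             /* include the terminating quote */
    }
}
```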

The function yyless() might be used to reprocess text in various
circumstances. Consider the problem of resolving (in old-style C) the ambiguity
of ‘=-a’. Suppose it is desired to treat this as ‘=- a’ but print a message. A rule
might be

=-[a-zA-Z]   {
             printf("Operator (=-) ambiguous\n");
             yyless(yyleng-1);
             ... action for =- ...
             }

which prints a message, returns the letter after the operator to the input stream,
and treats the operator as ‘=-’. Alternatively it might be desired to treat this as
‘= -a’. To do this, just return the minus sign as well as the letter to the input:

=-[a-zA-Z]   {
             printf("Operator (=-) ambiguous\n");
             yyless(yyleng-2);
             ... action for = ...
             }

performs the other interpretation. Note that the expressions for the two cases
might more easily be written:

=-/[A-Za-z]

in the first case and

=/-[A-Za-z]

in the second; no backup would be required in the rule action. It is not necessary
to recognize the whole identifier to observe the ambiguity. The possibility of
‘=-3’, however, makes

=-/[^ \t\n]

a still better rule.

In addition to these routines, lex also permits access to the I/O routines it uses. 
They are: 

1. input() which returns the next input character;

2. output(c) which writes the character c on the output; and

3. unput(c) which pushes the character c back onto the input stream to be read
later by input().

By default these routines are provided as macro definitions, but the programmer
can override them and supply private versions. These routines define the
relationship between external files and internal characters, and must all be
retained or modified consistently. They may be redefined, to transmit input or
output to or from strange places, including other programs or internal memory;
but the character set used must be consistent in all routines; a value of zero
returned by input must mean end of file; and the relationship between unput and
input must be retained or the lex lookahead will not work. lex does not look
ahead at all if it does not have to, but every rule ending in + * ? or $ or
containing / implies lookahead. Lookahead is also necessary to match an
expression that is a prefix of another expression. See section 10.10 for a
discussion of the character set used by lex. The standard lex library imposes a
100-character limit on backup.

Another lex library routine that the programmer will sometimes want to 
redefine is yywrap () which is called whenever lex reaches an end-of-file. If 
yywrap returns a 1, lex continues with the normal wrapup on end of input. 
Sometimes, however, it is convenient to arrange for more input to arrive from a 
new source. In this case, the programmer should provide a yywrap which 
arranges for new input and returns 0. This instructs lex to continue processing. 
The default yywrap always returns 1. 

This routine is also a convenient place to print tables, summaries, etc. at the end 
of a program. Note that it is not possible to write a normal rule which recognizes 
end-of-file; the only access to this condition is through yywrap. 

In fact, unless a private version of input () is supplied a file containing nulls 
cannot be handled, since a value of 0 returned by input is taken to be end-of- 
file. 
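
The contract between input and unput can be illustrated with a pair of private
routines that read from a string in memory. This is a sketch only, not the
macros that lex supplies: the names input_sketch, unput_sketch, and set_source
are hypothetical, and the 100-character pushback buffer simply mirrors the limit
mentioned above.

```c
#define PUSHBACK_MAX 100            /* mirrors the documented backup limit */

static char pushback[PUSHBACK_MAX]; /* characters returned by unput, as a stack */
static int npushed = 0;
static const char *src = "";        /* remaining unread input */

/* Select the in-memory string to read from. */
void set_source(const char *s)
{
    src = s;
    npushed = 0;
}

/* Return the next character; 0 means end of file. */
int input_sketch(void)
{
    if (npushed > 0)
        return (unsigned char)pushback[--npushed];  /* most recent pushback first */
    if (*src == '\0')
        return 0;
    return (unsigned char)*src++;
}

/* Push c back so that the next input_sketch() returns it.
 * Returns 0 on success, -1 if the pushback buffer is full. */
int unput_sketch(int c)
{
    if (npushed >= PUSHBACK_MAX)
        return -1;
    pushback[npushed++] = (char)c;
    return 0;
}
```

Because 0 is reserved for end of file, a source containing a null character
would be cut short at that point, which is the restriction noted above.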

10.4. Ambiguous Source Rules

lex can handle ambiguous specifications. When more than one expression can
match the current input, lex chooses as follows:

1. The longest match is preferred. 

2. Among rules which matched the same number of characters, the rule given 
first is preferred. 


Thus, suppose the rules

integer   keyword action ...;
[a-z]+    identifier action ...;

to be given in that order. If the input is integers, it is taken as an identifier,
because [a-z]+ matches 8 characters, while integer matches only 7. If the
input is integer, both rules match 7 characters, and the keyword rule is
selected because it was given first. Anything shorter (for example, int) will not
match the expression integer, and so the identifier interpretation is used.
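
The two selection principles can be written out by hand for this pair of rules.
The C function below is a sketch of the decision only, not the automaton lex
builds; pick_rule and the enumeration constants are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { NONE, KEYWORD, IDENT };

/* Decide which of the two rules wins at the start of in.
 * Rule 1 is the literal integer; rule 2 is [a-z]+.
 * The length of the winning match is stored in *len. */
int pick_rule(const char *in, size_t *len)
{
    size_t kw = (strncmp(in, "integer", 7) == 0) ? 7 : 0;  /* rule 1 */
    size_t id = 0;                                         /* rule 2 */

    while (islower((unsigned char)in[id]))
        id++;

    if (kw == 0 && id == 0) {
        *len = 0;
        return NONE;            /* neither rule matches here */
    }
    if (kw >= id) {
        *len = kw;              /* equal lengths: the rule given first wins */
        return KEYWORD;
    }
    *len = id;                  /* the longer match is preferred */
    return IDENT;
}
```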

The principle of preferring the longest match makes rules containing expressions
like .* dangerous. For example,

'.*'

might seem a good way of recognizing a string in single quotes. But it is an
invitation for the program to read far ahead, looking for a distant single quote.
Presented with the input

'first' quoted string here, 'second' here

the above expression matches

'first' quoted string here, 'second'

which is probably not what was wanted. A better rule is of the form

'[^'\n]*'

which, on the above input, stops after 'first'. The consequences of errors like
this are mitigated by the fact that the . operator does not match newline. Thus
expressions like .* stop on the current line. Don’t try to defeat this with
expressions like (.|\n)+ or equivalents; the lex generated program will try to
read the entire input file, causing internal buffer overflows.
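
The difference between the two expressions is easy to verify by hand. The
following C sketch computes the length each one would match at the start of a
line; match_greedy and match_bounded are hypothetical names, and the code is an
illustration of longest-match behavior rather than anything lex generates.

```c
#include <stddef.h>

/* Length matched by '.*' at the start of in: runs to the LAST single
 * quote on the current line. Returns 0 if there is no match. */
size_t match_greedy(const char *in)
{
    size_t i, last = 0;

    if (in[0] != '\'')
        return 0;
    for (i = 1; in[i] != '\0' && in[i] != '\n'; i++)
        if (in[i] == '\'')
            last = i;           /* remember the most distant quote so far */
    return last != 0 ? last + 1 : 0;
}

/* Length matched by '[^'\n]*' at the start of in: stops at the FIRST
 * single quote after the opening one. Returns 0 if there is no match. */
size_t match_bounded(const char *in)
{
    size_t i = 1;

    if (in[0] != '\'')
        return 0;
    while (in[i] != '\0' && in[i] != '\n' && in[i] != '\'')
        i++;
    return in[i] == '\'' ? i + 1 : 0;
}
```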

Note that lex is normally partitioning the input stream, not searching for all
possible matches of each expression. This means that each character is accounted
for once and only once. For example, suppose it is desired to count occurrences
of both she and he in an input text. Some lex rules to do this might be

she   s++;
he    h++;
\n    |
.     ;

where the last two rules ignore everything besides he and she. Remember that
. does not include newline. Since she includes he, lex will normally not
recognize the instances of he included in she, since once it has passed a she
those characters are gone.

Sometimes the programmer would like to override this choice. The action 
REJECT means ‘go do the next alternative.’ It executes whatever rule was 
second choice after the current rule. The position of the input pointer is adjusted 
accordingly. Suppose the programmer really wants to count the included 
instances of he: 


she   {s++; REJECT;}
he    {h++; REJECT;}
\n    |
.     ;

These rules are one way of changing the previous example to do just that. After
counting each expression, it is rejected; whenever appropriate, the other
expression is then counted. In this example, of course, the programmer could
note that she includes he but not vice versa, and omit the REJECT action on he;
in other cases, however, it would not be possible a priori to tell which input
characters were in both classes.
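
With REJECT on both rules, every input position ends up being tested against
both words, so the counts are the same as those of a simple overlapping search.
The C sketch below computes them directly; count_she_he is a hypothetical name
and the function is an illustration, not lex output.

```c
#include <string.h>

/* Count every occurrence of she and of he in the string in,
 * including each he that is contained in a she. */
void count_she_he(const char *in, int *s, int *h)
{
    *s = 0;
    *h = 0;
    for (; *in != '\0'; in++) {
        if (strncmp(in, "she", 3) == 0)
            (*s)++;
        if (strncmp(in, "he", 2) == 0)
            (*h)++;
    }
}
```

On the text "she said he heard" this gives one she and three instances of he;
the rules without REJECT would report only two, because the he inside she
would already have been consumed.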


Consider the two rules

a[bc]+   { ... ; REJECT;}
a[cd]+   { ... ; REJECT;}


If the input is ab, only the first rule matches, and on ad only the second matches. 
The input string accb matches the first rule for four characters and then the 
second rule for three characters. In contrast, the input accd agrees with the 
second rule for four characters and the first rule for three. 

In general, REJECT is useful whenever the purpose of lex is not to partition the 
input stream but to detect all examples of some items in the input, and the 
instances of these items may overlap or include each other. Suppose a digram 
table of the input is desired; normally the digrams overlap, that is the word the 
is considered to contain both th and he. Assuming a two-dimensional array 
named digram to be incremented, the appropriate source is shown below. 

%%
[a-z][a-z]   {digram[yytext[0]][yytext[1]]++; REJECT;}
.            ;
\n           ;

where the REJECT is necessary to pick up a letter pair beginning at every
character, rather than at every other character.

10.5. lex Source Definitions

Remember the format of the lex source:

{definitions}
%%
{rules}
%%
{programmer routines}

So far only the rules have been described. The programmer needs additional
options, though, to define variables for use in his program and for use by lex.
These can go either in the definitions section or in the rules section.

Remember that lex is turning the rules into a program. Any source not
intercepted by lex is copied into the generated program. There are three classes
of such things.

1. Any line which is not part of a lex rule or action which begins with a blank
or tab is copied into the lex-generated program. Such source input prior to
the first %% delimiter is external to any function in the code; if it appears
immediately after the first %%, it appears in an appropriate place for
declarations in the function written by lex which contains the actions. This
material must look like program fragments, and should precede the first lex
rule.

As a side effect of the above, lines which begin with a blank or tab, and
which contain a comment, are passed through to the generated program.

This can be used to include comments in either the lex source or the
generated code. The comments should follow the host language convention.

2. Anything included between lines containing only the delimiters %{ and %}
is copied out as above. The delimiters are discarded. This format permits
entering text like preprocessor statements that must begin in column 1, or
copying lines that do not look like programs.

3. Anything after the second %% delimiter, regardless of formats, etc., is copied
out after the lex output.

Definitions intended for lex are given before the first %% delimiter. Any line in
this section not contained between %{ and %}, and beginning in column 1, is
assumed to define lex substitution strings. The format of such lines is

name translation

and it associates the string given as a translation with the name. The name and
translation must be separated by at least one blank or tab, and the name must
begin with a letter. The translation can then be invoked by the {name} syntax in
a rule. Using {D} for the digits and {E} for an exponent field, for example,
might abbreviate rules to recognize numbers:


D                    [0-9]
E                    [DEde][-+]?{D}+
%%
{D}+                 printf("integer");
{D}+"."{D}*({E})?    |
{D}*"."{D}+({E})?    |
{D}+{E}              printf("real");


Note the first two rules for real numbers; both require a decimal point and
contain an optional exponent field, but the first requires at least one digit before the
decimal point and the second requires at least one digit after the decimal point.
To correctly handle the problem posed by a FORTRAN expression such as
35.EQ.I, which does not contain a real number, a context-sensitive rule such as

[0-9]+/"."EQ printf("integer");

could be used in addition to the normal rule for integers.
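To see which strings the four number rules accept, the same patterns can be
hand-coded in plain C. This is only a sketch for experimentation, not the code
that lex generates; the names classify_number, skip_digits and skip_exponent
are hypothetical, and the function judges a whole string rather than scanning a
stream.

```c
#include <ctype.h>
#include <stddef.h>

enum number_kind { NOT_A_NUMBER, INTEGER_NUMBER, REAL_NUMBER };

/* Advance past {D}* and return how many digits were skipped. */
static size_t skip_digits(const char **s)
{
    size_t n = 0;

    while (isdigit((unsigned char)**s)) {
        (*s)++;
        n++;
    }
    return n;
}

/* Match {E}, that is [DEde][-+]?{D}+ ; on success advance *s and return 1. */
static int skip_exponent(const char **s)
{
    const char *p = *s;

    if (*p != 'D' && *p != 'E' && *p != 'd' && *p != 'e')
        return 0;
    p++;
    if (*p == '-' || *p == '+')
        p++;
    if (skip_digits(&p) == 0)
        return 0;               /* an exponent needs at least one digit */
    *s = p;
    return 1;
}

/* Classify a complete string according to the four rules above. */
enum number_kind classify_number(const char *s)
{
    size_t before = skip_digits(&s);    /* digits before the point */
    size_t after = 0;                   /* digits after the point */
    int has_point = 0;
    int has_exponent;

    if (*s == '.') {
        has_point = 1;
        s++;
        after = skip_digits(&s);
    }
    has_exponent = skip_exponent(&s);
    if (*s != '\0')
        return NOT_A_NUMBER;            /* unmatched trailing characters */
    if (has_point)                      /* rules 2 and 3: a digit on either side */
        return (before + after > 0) ? REAL_NUMBER : NOT_A_NUMBER;
    if (before == 0)
        return NOT_A_NUMBER;
    return has_exponent ? REAL_NUMBER : INTEGER_NUMBER;
}
```

Under these assumptions classify_number("3.") and classify_number(".5e+10")
both report a real number, while "35.EQ.I" taken as a whole is rejected, which
is why the trailing-context rule for integers is needed.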

The definitions section may also contain other commands, including the selection
of a host language, a character set table, a list of start conditions, or adjustments
to the default size of arrays within lex itself for larger source programs. These
possibilities are discussed below under section 10.11, Summary of Source
Format.

10.6. Using lex

There are two steps in compiling a lex source program. First, the lex source
must be turned into a generated program in the host general-purpose language.
Then this program must be compiled and loaded, usually with a library of lex
subroutines. The generated program is on a file named lex.yy.c. The I/O
library is defined in terms of the C standard library in section 3 of the SunOS
Reference Manual.

The lex library is accessed by the loader flag -ll.

So an appropriate set of commands is:

tutorial% lex source
tutorial% cc lex.yy.c -ll

The resulting program is placed on the usual file a.out for later execution. To
use lex with yacc see below. Although the default lex I/O routines use the C
standard library, the lex automata themselves do not do so; if private versions
of input, output, and unput are given, the library can be avoided. lex has
several options which are described in the lex(1) manual page.

10.7. lex and yacc

If you want to use lex with yacc, note that what lex writes is a program
named yylex(), the name required by yacc for its analyzer. Normally, the
default main program in the lex library calls this routine, but if yacc is loaded,
and its main program is used, yacc calls yylex().

In this case each lex rule should end with

return(token);

to return the appropriate token value.

An easy way to get access to yacc’s names for tokens is to compile the lex
output file as part of the yacc output file by placing the line

# include "lex.yy.c"

in the last section of yacc input. Supposing the grammar to be named ‘good’
and the lexical rules to be named ‘better’, the command sequence can just be:

tutorial% yacc good
tutorial% lex better
tutorial% cc y.tab.c -ll
tutorial%

The lex and yacc programs can be generated in either order.

10.8. Examples

As a trivial problem, consider copying an input file while adding 3 to every
nonnegative number divisible by 7. Here is a suitable lex source program



to do just that. The rule [0-9]+ recognizes strings of digits; atoi() converts
the digits to binary and stores the result in k.

The operator % (remainder) is used to check whether k is divisible by 7; if it is, it
is incremented by 3 as it is written out. It may be objected that this program will
alter such input items as 49.63 or X7. Furthermore, it increments the absolute
value of all negative numbers divisible by 7. To avoid this, just add a few more
rules after the active one, as shown below.





Revision A of 9 May 1988 





Numerical strings containing a ‘.’ or preceded by a letter are picked up by one of
the last two rules, and not changed. The if-else has been replaced by a C
conditional expression to save space; the form a?b:c means ‘if a then b else
c’.
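The behaviour described above can be tried out without lex. The following is a
hand-written C sketch, not lex output. It assumes that the refined rule set
amounts to three cases: an optionally signed run of digits (changed if divisible
by 7), an optionally signed run of digits and dots (copied), and a letter
followed by letters and digits (copied). The function name
add3_to_multiples_of_7 and its buffer interface are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy `in` to `out` (capacity `outsz` bytes), adding 3 to every integer
 * that is divisible by 7.  Returns 0 on success, -1 if `out` is too small.
 */
int add3_to_multiples_of_7(const char *in, char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return -1;
    while (*in != '\0') {
        const char *start = in;
        const char *piece;
        char number[32];
        size_t len;
        int pure_integer = 0;

        if (isalpha((unsigned char)*in)) {
            /* Identifier such as X7: skip it so that it is copied unchanged. */
            while (isalnum((unsigned char)*in))
                in++;
        } else if (isdigit((unsigned char)*in) || *in == '.' ||
                   (*in == '-' &&
                    (isdigit((unsigned char)in[1]) || in[1] == '.'))) {
            /* Numeric string; it stays "pure" only if it has no dot. */
            pure_integer = 1;
            if (*in == '-')
                in++;
            while (isdigit((unsigned char)*in) || *in == '.') {
                if (*in == '.')
                    pure_integer = 0;
                in++;
            }
        } else {
            in++;                       /* default: copy one character */
        }

        piece = start;
        len = (size_t)(in - start);
        if (pure_integer) {
            long k = strtol(start, NULL, 10);

            len = (size_t)snprintf(number, sizeof number, "%ld",
                                   k % 7 == 0 ? k + 3 : k);
            piece = number;
        }
        if (used + len + 1 > outsz)
            return -1;
        memcpy(out + used, piece, len);
        used += len;
    }
    out[used] = '\0';
    return 0;
}
```

Because the sign is now part of the matched number, -21 becomes -18 rather than
-24, and items such as 49.63 and X7 pass through untouched.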


For an example of statistics gathering, here is a program which constructs a
histogram of the lengths of words, where a word is defined as a string of letters.



This program accumulates the histogram, while producing no output. At the end
of the input it prints the table. The final statement return(1); indicates that
lex is to perform wrapup. If yywrap returns zero (false) it implies that further
input is available and the program is to continue reading and processing.
Providing a yywrap that never returns true causes an infinite loop.
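The counting and reporting steps can be sketched in ordinary C as follows. This
is an illustration of the idea only, written by hand rather than produced by
lex; the names count_word_lengths, print_histogram and MAX_WORD_LENGTH are
assumptions, and a word is taken to be any run of characters for which
isalpha() is true.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_WORD_LENGTH 100

/*
 * Add the words found in `text` to the histogram: lengs[n] is incremented
 * for every run of n letters.  Runs of MAX_WORD_LENGTH or more letters are
 * counted in the last slot.  Returns the number of words seen.
 */
int count_word_lengths(const char *text, int lengs[MAX_WORD_LENGTH])
{
    int words = 0;

    while (*text != '\0') {
        if (isalpha((unsigned char)*text)) {
            size_t len = 0;

            while (isalpha((unsigned char)*text)) {
                text++;
                len++;
            }
            if (len >= MAX_WORD_LENGTH)
                len = MAX_WORD_LENGTH - 1;
            lengs[len]++;
            words++;
        } else {
            text++;             /* anything else produces no output */
        }
    }
    return words;
}

/* Print the table, as a wrapup routine would do at end of input. */
void print_histogram(const int lengs[MAX_WORD_LENGTH], FILE *out)
{
    int i;

    fprintf(out, "Length  No. words\n");
    for (i = 0; i < MAX_WORD_LENGTH; i++)
        if (lengs[i] > 0)
            fprintf(out, "%5d%10d\n", i, lengs[i]);
}
```

In a lex program the counting would be done by a rule action and the printing
by yywrap; here the caller is expected to zero the array first and to call
print_histogram once all input has been counted.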

As a larger example, here are some parts of a program written by N. L. Schryer
to convert double-precision FORTRAN to single-precision FORTRAN. Because
FORTRAN does not distinguish upper and lower case letters, this routine begins
by defining a set of classes including both cases of each letter:

An additional class recognizes white space:

W [ \t]*

The first rule changes double precision to real, or DOUBLE PRECISION
to REAL.

{d}{o}{u}{b}{l}{e}{W}{p}{r}{e}{c}{i}{s}{i}{o}{n} {
printf(yytext[0]=='d'? "real" : "REAL");
}

Care is taken throughout this program to preserve the case (upper or lower) of the
original program. The conditional operator is used to select the proper form of
the keyword. The next rule copies continuation card indications to avoid
confusing them with constants:

^"     "[^ 0] ECHO;

In the regular expression, the quotes surround the blanks. It is interpreted as
‘beginning of line, then five blanks, then anything but blank or zero.’ Note the
two different meanings of ^. There follow some rules to change double-precision
constants to ordinary floating constants.


[0-9]+{W}{d}{W}[+-]?{W}[0-9]+            |
[0-9]+{W}"."{W}{d}{W}[+-]?{W}[0-9]+      |
"."{W}[0-9]+{W}{d}{W}[+-]?{W}[0-9]+      {
        /* convert constants */
        for (p = yytext; *p != 0; p++)
                {
                if (*p == 'd' || *p == 'D')
                        *p += 'e' - 'd';
                }
        ECHO;
        }


After the floating point constant is recognized, it is scanned by the for loop to
find the letter d or D. The program then adds 'e'-'d', which converts it to the
next letter of the alphabet. The modified constant, now single-precision, is
written out again. There follow a series of names which must be respelled to remove
their initial d. By using the array yytext the same action suffices for all the
names (only a sample of a rather long list is given here).

Another list of names must have initial d changed to initial a:

{d}{l}{o}{g}     |
{d}{l}{o}{g}10   |
{d}{m}{i}{n}1    |
{d}{m}{a}{x}1    {
        yytext[0] += 'a' - 'd';
        ECHO;
        }

And one routine must have initial d changed to initial r:

{d}1{m}{a}{c}{h} {
        yytext[0] += 'r' - 'd';
        ECHO;
        }

To avoid such names as dsinx being detected as instances of dsin, some final
rules pick up longer words as identifiers and copy some surviving characters:

[A-Za-z][A-Za-z0-9]*   |
[0-9]+                 |
\n                     |
.                      ECHO;

Note that this program is not complete; it does not deal with the spacing
problems in FORTRAN or with the use of keywords as identifiers.

10.9. Left Context-Sensitivity

Sometimes it is desirable to have several sets of lexical rules to be applied at
different times in the input. For example, a compiler preprocessor might
distinguish preprocessor statements and analyze them differently from ordinary
statements. This requires sensitivity to prior context, and there are several ways of
handling such problems. The ^ operator, for example, is a prior context
operator, recognizing immediately preceding left context just as $ recognizes
immediately following right context. Adjacent left context could be extended, to
produce a facility similar to that for adjacent right context, but it is unlikely to be as
useful, since often the relevant left context appeared some time earlier, such as at
the beginning of a line.

This section describes three means of dealing with different environments: a
simple use of flags, when only a few rules change from one environment to another,
the use of start conditions on rules, and the possibility of making multiple lexical
analyzers all run together. In each case, there are rules which recognize the need
to change the environment in which the following input text is analyzed, and set
some parameter to reflect the change. This may be a flag explicitly tested by the
programmer’s action code; such a flag is the simplest way of dealing with the
problem, since lex is not involved at all. It may be more convenient, however,
to have lex remember the flags as initial conditions on the rules. Any rule may
be associated with a start condition. It is only recognized when lex is in that
start condition. The current start condition may be changed at any time. Finally,
if the sets of rules for the different environments are very dissimilar, clarity may
be best achieved by writing several distinct lexical analyzers, and switching from
one to another as desired.


Consider the following problem: copy the input to the output, changing the word 
magic to first on every line which begins with the letter a, changing magic 
to second on every line which begins with the letter b, and changing magic to 
third on every line which begins with the letter c. All other words and all 
other lines are left unchanged. 


These rules are so simple that the easiest way to do this job is with a flag: 


%%
        int flag;
^a      {flag = 'a'; ECHO;}
^b      {flag = 'b'; ECHO;}
^c      {flag = 'c'; ECHO;}
\n      {flag =  0 ; ECHO;}
magic   {
        switch (flag)
        {
        case 'a': printf("first"); break;
        case 'b': printf("second"); break;
        case 'c': printf("third"); break;
        default: ECHO; break;
        }
        }


should be adequate. 

To handle the same problem with start conditions, each start condition must be
introduced to lex in the definitions section with a line reading

%Start name1 name2 ...

where the conditions may be named in any order. The word Start may be
abbreviated to s or S. The conditions may be referenced at the head of a rule
with the <> brackets:

<name1>expression

is a rule which is only recognized when lex is in the start condition name1. To
enter a start condition, execute the action statement

BEGIN name1;

which changes the start condition to name1. To resume the normal state,

BEGIN 0;

which resets to the initial condition of the lex automaton interpreter. A rule
may be active in several start conditions:

<name1,name2,name3>

is a legal prefix. Any rule not beginning with the <> prefix operator is always
active.

The same example as before can be written:

%START AA BB CC
%%
^a          {ECHO; BEGIN AA;}
^b          {ECHO; BEGIN BB;}
^c          {ECHO; BEGIN CC;}
\n          {ECHO; BEGIN 0;}
<AA>magic   printf("first");
<BB>magic   printf("second");
<CC>magic   printf("third");

where the logic is exactly the same as in the previous method of handling the 
problem, but lex does the work rather than the programmer’s code. 
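For comparison, the state that lex keeps on the programmer's behalf can be made
explicit in a hand-written C sketch. The function name replace_magic and its
buffer interface are hypothetical, and the enumeration stands in for the start
conditions AA, BB and CC; this is not the code that lex generates.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy `in` to `out` (capacity `outsz` bytes), replacing "magic" according
 * to the first letter of the current line.  Returns 0 on success, -1 if
 * `out` is too small.
 */
int replace_magic(const char *in, char *out, size_t outsz)
{
    enum { NONE, AA, BB, CC } state = NONE;
    int at_line_start = 1;
    size_t used = 0;

    if (outsz == 0)
        return -1;
    while (*in != '\0') {
        const char *piece = in;     /* text to emit */
        size_t consumed = 1;        /* input characters used */
        size_t emitted = 1;         /* output characters produced */

        if (strncmp(in, "magic", 5) == 0) {
            consumed = emitted = 5;
            if (state == AA)
                piece = "first";
            else if (state == BB)
                piece = "second";
            else if (state == CC)
                piece = "third";
            if (state != NONE)
                emitted = strlen(piece);
        } else if (*in == '\n') {
            state = NONE;           /* corresponds to BEGIN 0 */
        } else if (at_line_start) {
            if (*in == 'a')
                state = AA;         /* corresponds to BEGIN AA */
            else if (*in == 'b')
                state = BB;
            else if (*in == 'c')
                state = CC;
        }
        at_line_start = (*in == '\n');

        if (used + emitted + 1 > outsz)
            return -1;
        memcpy(out + used, piece, emitted);
        used += emitted;
        in += consumed;
    }
    out[used] = '\0';
    return 0;
}
```

A line that itself begins with the word magic selects no condition, because
the five-character match wins over the one-character line-start rules, so the
word is copied unchanged.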


10.10. Character Set

The programs generated by lex handle character I/O only through the routines
input, output, and unput. Thus the character representation provided in
these routines is accepted by lex and employed to return values in yytext.

For internal use a character is represented as a small integer which, if the
standard library is used, has a value equal to the integer value of the bit pattern
representing the character on the host computer. Normally, the letter a is
represented in the same form as the character constant 'a'.

If this interpretation is changed, by providing I/O routines which translate the
characters, lex must be told about it, by giving a translation table. This table
must be in the definitions section, and must be bracketed by two lines containing
only ‘%T’. The table contains lines of the form

{integer} {character string}

which indicate the value associated with each character. Thus the next example

Figure 10-3  Sample character table.

maps the lower and upper case letters together into the integers 1 through 26,
newline into 27, + and - into 28 and 29, and the digits into 30 through 39. Note
the escape for newline. If a table is supplied, every character that is to appear
either in the rules or in any valid input must be included in the table. No
character may be assigned the number 0, and no character may be assigned a bigger
number than the size of the hardware character set.
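The mapping just described can be written out as a small C function, which is
roughly what a translating input routine would have to compute. This is a
sketch only: the name translate_char is hypothetical, the return value 0 is
used here to mean ‘not in the table’, and the arithmetic assumes a character
set such as ASCII in which letters and digits are contiguous.

```c
/*
 * Return the internal code for character c under the sample table:
 * letters (either case) 1..26, newline 27, '+' 28, '-' 29, digits 30..39.
 * Returns 0 for characters that the table does not list.
 */
int translate_char(int c)
{
    if (c >= 'a' && c <= 'z')
        return c - 'a' + 1;
    if (c >= 'A' && c <= 'Z')
        return c - 'A' + 1;
    if (c == '\n')
        return 27;
    if (c == '+')
        return 28;
    if (c == '-')
        return 29;
    if (c >= '0' && c <= '9')
        return c - '0' + 30;
    return 0;
}
```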

10.11. Summary of Source Format

The general form of a lex source file is:

{definitions}
%%
{rules}
%%
{programmer routines}

The definitions section contains a combination of

1. Definitions, in the form ‘name space translation’.

2. Included code, in the form ‘space code’.

3. Included code, in the form

%{
code
%}

4. Start condition declarations, given in the form

%S name1 name2 ...

5. Character set tables, in the form

%T
number space character-string
...
%T

6. Changes to internal array sizes, in the form

%x nnn

where nnn is a decimal integer representing an array size and x selects the
parameter as follows:

Table 10-1  Changing Internal Array Sizes in lex

Letter    Parameter
p         positions
n         states
e         tree nodes
a         transitions
k         packed character classes
o         output array size


Lines in the rules section have the form ‘expression action’ where the action
may be continued on succeeding lines by using braces to delimit it.

Regular expressions in lex use the following operators:

Table 10-2  Regular Expression Operators in lex

Operator   Meaning
x          the character "x"
"x"        an "x", even if x is an operator
\x         an "x", even if x is an operator
[xy]       the character x or y
[x-z]      the characters x, y or z
[^x]       any character but x
.          any character but newline
^x         an x at the beginning of a line
<y>x       an x when lex is in start condition y
x$         an x at the end of a line
x?         an optional x
x*         0,1,2,... instances of x
x+         1,2,3,... instances of x
x|y        an x or a y
(x)        an x
x/y        an x but only if followed by y
{xx}       the translation of xx from the definitions section
x{m,n}     m through n occurrences of x

10.12. Caveats and Bugs

There are pathological expressions which produce exponential growth of the
tables when converted to deterministic automata; fortunately, they are rare.

REJECT does not rescan the input; instead it remembers the results of the
previous scan. This means that if a rule with trailing context is found, and REJECT is
executed, the programmer must not have used unput to change the characters
forthcoming from the input stream. This is the only restriction on the
programmer’s ability to manipulate the not-yet-processed input.
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yacc — Yet Another Compiler- 

Compiler 

Computer program input generally has some structure; in fact, every computer
program that does input can be thought of as defining an ‘input language’ which
it accepts. An input language may be as complex as a programming language, or
as simple as a sequence of numbers. Unfortunately, usual input facilities are
limited, difficult to use, and often are lax about checking their inputs for validity.

yacc provides a general tool for describing the input to a computer program.

The yacc programmer specifies the structure of the input, together with code to
be invoked as each item is recognized. yacc turns such a specification into a
subroutine that handles the input process; frequently, it is convenient and
appropriate to have most of the flow of control in the programmer’s application
handled by this subroutine.

The input subroutine produced by yacc calls a programmer-supplied routine to
return the next basic input item. Thus, the programmer can specify his input in
terms of individual input characters, or in terms of higher-level constructs such as
names and numbers. The programmer-supplied routine may also handle
idiomatic features such as comment and continuation conventions, which
typically defy easy grammatical specification.

The class of specifications that yacc accepts is a very general one: LALR(1)
grammars with disambiguating rules.

In addition to compilers for C, FORTRAN, APL, Pascal, Ratfor, etc., yacc has
also been used for less conventional languages, including a phototypesetter
language, several desk calculator languages, a document retrieval system, and a
FORTRAN debugging system.

yacc provides a general tool for imposing structure on the input to a computer
program. The yacc programmer prepares a specification of the input process;
this includes rules describing the input structure, code to be invoked when these
rules are recognized, and a low-level routine to do the basic input. yacc then
generates a function to control the input process. This function, called a parser,
calls the programmer-supplied low-level input routine (the lexical analyzer) to
pick up the basic items (called tokens) from the input stream. These tokens are
organized according to the input structure rules, called grammar rules; when one
of these rules has been recognized, then programmer code supplied for this rule,
an action, is invoked; actions have the ability to return values and make use of
the values of other actions.

yacc generates its actions and output subroutines in C. Moreover, many of the
syntactic conventions of yacc follow C.

The heart of the yacc input specification is a collection of grammar rules. Each
rule describes an allowable structure and gives it a name. For example, one
grammar rule might be

date : month_name day ',' year ;

Here, date, month_name, day, and year represent structures of interest in the
input process; presumably, month_name, day, and year are defined elsewhere.
The comma is enclosed in single quotes; this implies that the comma is to
appear literally in the input. The colon and semicolon merely serve as
punctuation in the rule, and have no significance in controlling the input. Thus, with
proper definitions, the input

July 4, 1776

might be matched by the above rule.

An important part of the input process is carried out by the lexical analyzer. This
routine reads the input stream, recognizing the lower-level structures, and
communicates these tokens to the parser. For historical reasons, a structure
recognized by the lexical analyzer is called a terminal symbol, while the structure
recognized by the parser is called a nonterminal symbol. To avoid confusion,
terminal symbols are referred to as tokens.
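To make the division of labor concrete, here is a minimal sketch of the kind of
helper a lexical analyzer for the date example might use to turn a word into a
token. Everything in it is an assumption made for illustration: the function
name month_token, the token number 257 given to MONTH_NAME, the use of full
English month names, and the convention of returning -1 for a word that is not
a month.

```c
#include <stddef.h>
#include <string.h>

#define MONTH_NAME 257          /* hypothetical token number */

/*
 * If `word` is a month name, store its number (1..12) in *month and return
 * the token MONTH_NAME; otherwise return -1.
 */
int month_token(const char *word, int *month)
{
    static const char *const names[12] = {
        "January", "February", "March", "April", "May", "June",
        "July", "August", "September", "October", "November", "December"
    };
    int i;

    for (i = 0; i < 12; i++) {
        if (strcmp(word, names[i]) == 0) {
            if (month != NULL)
                *month = i + 1;
            return MONTH_NAME;
        }
    }
    return -1;
}
```

With a helper of this kind the parser sees a single token for July, and the
month number travels alongside it as the token's value.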


There is considerable leeway in deciding whether to recognize structures using
the lexical analyzer or grammar rules. For example, the rules

month_name : 'J' 'a' 'n' ;
month_name : 'F' 'e' 'b' ;
. . .
month_name : 'D' 'e' 'c' ;

might be used in the above example. The lexical analyzer would only need to
recognize individual letters, and month_name would be a nonterminal symbol.
Such low-level rules tend to waste time and space, and may complicate the
specification beyond yacc’s ability to deal with it. Usually, the lexical analyzer
would recognize the month names, and return an indication that a month_name
was seen; in this case, month_name would be a token.

Literal characters such as ‘,’ must also be passed through the lexical analyzer,
and are also considered tokens.

Specification files are very flexible. It is relatively easy to add to the above
example the rule

date : month '/' day '/' year ;

allowing

7 / 4 / 1776

as a synonym for

July 4, 1776

In most cases, this new rule could be ‘slipped in’ to a working system with
minimal effort and little danger of disrupting existing input.

The input being read may not conform to the specifications. These input errors
are detected as early as is theoretically possible with a left-to-right scan; thus, not
only is the chance of reading and computing with bad input data substantially
reduced, but the bad data can usually be quickly found. Error handling, provided
as part of the input specifications, permits the reentry of bad data, or the
continuation of the input process after skipping over the bad data.

In some cases, yacc fails to produce a parser when given a set of specifications.
For example, the specifications may be self-contradictory, or they may require a
more powerful recognition mechanism than that available to yacc. The former
cases represent design errors; the latter cases can often be corrected by making
the lexical analyzer more powerful, or by rewriting some of the grammar rules.
While yacc cannot handle all possible specifications, its power compares
favorably with similar systems; moreover, the constructions which are difficult for
yacc to handle are also frequently difficult for human beings to handle. Some
users have reported that the discipline of formulating valid yacc specifications
for their input revealed errors of conception or design early in the program
development.

The next several sections describe the basic process of preparing a yacc
specification; Section 11.1 describes the preparation of grammar rules, Section
11.2 the preparation of the programmer-supplied actions associated with these
rules, and Section 11.3 the preparation of lexical analyzers. Section 11.4
describes the operation of the parser. Section 11.5 discusses various reasons why
yacc may be unable to produce a parser from a specification, and what to do
about it. Section 11.6 describes a simple mechanism for handling operator
precedences in arithmetic expressions. Section 11.7 discusses error detection and
recovery. Section 11.8 discusses the operating environment and special features
of the parsers yacc produces. Section 11.9 gives some suggestions which
should improve the style and efficiency of the specifications. Section 11.10
discusses some advanced topics. Section 11.11 has a brief example, and Section
11.12 gives a summary of the yacc input syntax. Section 11.13 gives an
example using some of the more advanced features of yacc, and, finally, Section
11.14 describes mechanisms and syntax no longer actively supported, but
provided for historical continuity with older versions of yacc.

11.1. Basic Specifications

Names refer to either tokens or nonterminal symbols. yacc requires token
names to be declared as such. In addition, for reasons discussed in Section 11.3,
it is often desirable to include the lexical analyzer as part of the specification file;
it may be useful to include other programs as well. Thus, every specification file
consists of three sections: the declarations, (grammar) rules, and programs. The
sections are separated by double percent %% marks. The percent % is generally
used in yacc specifications as an escape character.


In other words, a full specification file looks like

declarations
%%
rules
%%
programs

The declaration section may be empty. Moreover, if the programs section is
omitted, the second %% mark may be omitted also; thus, the smallest legal yacc
specification is

%%
rules

Spaces (also called blanks), tabs, and newlines are ignored except that they may
not appear in names or multi-character reserved symbols. Comments may appear
wherever a name is legal; they are enclosed in /* . . . */, as in C and
PL/I.

The rules section is made up of one or more grammar rules. A grammar rule has
the form:

A : BODY ;

A represents a nonterminal name, and BODY represents a sequence of zero or
more names and literals. The colon and the semicolon are yacc punctuation.

Names may be of arbitrary length, and may be made up of letters, dot ‘.’,
underscore ‘_’, and non-initial digits. Upper and lower case letters are distinct. The
names used in the body of a grammar rule may represent tokens or nonterminal
symbols.

A literal consists of a character enclosed in single quotes ‘'’. As in C, the
backslash ‘\’ is an escape character within literals, and all the C escapes are
recognized:

'\n'      newline
'\r'      return
'\''      single quote '
'\\'      backslash \
'\t'      tab
'\b'      backspace
'\f'      form feed
'\xxx'    'xxx' in octal

For a number of technical reasons, the NUL character ('\0' or 0) should never
be used in grammar rules.


If there are several grammar rules with the same left hand side, the vertical bar ‘|’
can be used to avoid rewriting the left hand side. In addition, the semicolon at
the end of a rule can be dropped before a vertical bar. Thus the grammar rules



It is not necessary that all grammar rules with the same left side appear together
in the grammar rules section, although it makes the input much more readable,
and easier to change.


If a nonterminal symbol matches the empty string, this can be indicated in the
obvious way:

empty : ;

Names representing tokens must be declared; this is most simply done by writing

%token name1 name2 ...

in the declarations section. See Sections 11.3, 11.5, and 11.6 for much more
discussion. Every name not defined in the declarations section is assumed to
represent a nonterminal symbol. Every nonterminal symbol must appear on the
left side of at least one rule.


Of all the nonterminal symbols, one, called the start symbol, has particular
importance. The parser is designed to recognize the start symbol; thus, this
symbol represents the largest, most general structure described by the grammar
rules. By default, the start symbol is taken to be the left hand side of the first
grammar rule in the rules section. It is possible, and in fact desirable, to declare
the start symbol explicitly in the declarations section using the %start keyword:

%start symbol

The end of the input to the parser is signaled by a special token, called the
endmarker. If the tokens up to, but not including, the endmarker form a structure
which matches the start symbol, the parser function returns to its caller after the
endmarker is seen; it accepts the input. If the endmarker is seen in any other
context, it is an error.

It is the job of the programmer-supplied lexical analyzer to return the endmarker
when appropriate; see Section 11.3, below. Usually the endmarker represents
some reasonably obvious I/O status, such as ‘end-of-file’ or ‘end-of-record’.

11.2. Actions

With each grammar rule, the programmer may associate actions to be performed
each time the rule is recognized in the input process. These actions may return
values, and may obtain the values returned by previous actions. Moreover, the
lexical analyzer can return values for tokens, if desired.

An action is an arbitrary C statement, and as such can do input and output, call
subprograms, and alter external vectors and variables. An action is specified by
one or more statements, enclosed in curly braces ‘{’ and ‘}’. For example,

A : '(' B ')'
        { hello( 1, "abc" ); }

and

XXX : YYY ZZZ
        { printf("a message\n");
          flag = 25; }

are grammar rules with actions.

To facilitate easy communication between the actions and the parser, the action
statements are altered slightly. The dollar sign symbol ‘$’ is used as a signal to
yacc in this context.

To return a value, the action normally sets the pseudo-variable ‘$$’ to some
value. For example, an action that does nothing but return the value 1 is

{ $$ = 1 ; }

To obtain the values returned by previous actions and the lexical analyzer, the
action may use the pseudo-variables $1, $2, ..., which refer to the values
returned by the components of the right side of a rule, reading from left to right.
Thus, if the rule is

A : B C D ;

for example, then $2 has the value returned by C, and $3 the value returned by
D.


As a more concrete example, consider the rule

expr : '(' expr ')' ;

The value returned by this rule is usually the value of the expr in parentheses.
This can be indicated by

expr : '(' expr ')'
        { $$ = $2 ; }

By default, the value of a rule is the value of $1 (the first element in it). Thus,
grammar rules of the form

A : B ;

frequently need not have an explicit action.


In the examples above, all the actions came at the end of their rules. Sometimes,
it is desirable to get control before a rule is fully parsed. yacc permits an action
to be written in the middle of a rule as well as at the end. This rule is assumed to
return a value, accessible through the usual $ mechanism by the actions to the
right of it. In turn, it may access the values returned by the symbols to its left.
Thus, in the rule

A : B
        { $$ = 1; }
    C
        { x = $2; y = $3; }
    ;

the effect is to set x to 1, and y to the value returned by C.

Actions that do not terminate a rule are actually handled by yacc by
manufacturing a new nonterminal symbol name, and a new rule matching this name to the
empty string. The interior action is the action triggered off by recognizing this
added rule. yacc actually treats the above example as if it had been written:

$ACT : /* empty */
        { $$ = 1; }
    ;

A : B $ACT C
        { x = $2; y = $3; }
    ;

In many applications, output is not done directly by the actions; rather, a data
structure, such as a parse tree, is constructed in memory, and transformations are
applied to it before output is generated. Parse trees are particularly easy to
construct, given routines to build and maintain the tree structure desired. For
example, suppose there is a C function node, written so that the call

    node( L, n1, n2 )

creates a node with label L, and descendants n1 and n2, and returns the index of
the newly created node. The parse tree can be built by supplying actions such as:

    expr : expr '+' expr
            { $$ = node( '+', $1, $3 ); }

in the specification.
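
A routine with the behavior described for node might be sketched as follows. This is only one possible layout, not the manual's own code: the fixed-size table, the MAXNODES limit, the struct tnode type, and the use of -1 for a missing descendant are all assumptions made for illustration.

```c
#include <assert.h>

#define MAXNODES 256            /* hypothetical table size */

/* One tree node: a label and the indices of its two descendants.
   A descendant index of -1 means "none" (an assumption of this sketch). */
struct tnode {
    int label;
    int left;
    int right;
};

static struct tnode nodes[MAXNODES];
static int nnodes = 0;          /* number of nodes created so far */

/* Create a node with label L and descendants n1 and n2;
   return the index of the newly created node. */
int node(int L, int n1, int n2)
{
    assert(nnodes < MAXNODES);  /* the table is not grown in this sketch */
    nodes[nnodes].label = L;
    nodes[nnodes].left = n1;
    nodes[nnodes].right = n2;
    return nnodes++;
}
```

Because node returns a plain integer index, its result fits the integer values that the $$, $1, and $3 pseudo-variables carry in these examples.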

The programmer may define other variables to be used by the actions. Declarations
and definitions can appear in the declarations section, enclosed in the marks
‘%{’ and ‘%}’. These declarations and definitions have global scope, so they are
known to the action statements and the lexical analyzer. For example,

    %{ int variable = 0; %}

could be placed in the declarations section, making variable accessible to all
of the actions. The yacc parser uses only names beginning with ‘yy’; the programmer
should avoid such names.

In these examples, all the values are integers: a discussion of values of other 
types will be found in Section 11.10. 

11.3. Lexical Analysis

The programmer must supply a lexical analyzer to read the input stream and
communicate tokens (with values, if desired) to the parser. The lexical analyzer
is an integer-valued function called yylex(). The function returns an integer,
the token number, representing the kind of token read. If there is a value associated
with that token, it should be assigned to the external variable yylval.

The parser and the lexical analyzer must agree on these token numbers in order
for communication between them to take place. The numbers may be chosen by
yacc, or chosen by the programmer. In either case, the ‘#define’ mechanism of
C is used to allow the lexical analyzer to return these numbers symbolically. For
example, suppose that the token name DIGIT has been defined in the declarations
section of the yacc specification file. The relevant portion of the lexical
analyzer might look like:

    yylex(){
        extern int yylval;
        int c;
        ...
        c = getchar();
        ...
        switch( c ) {
            ...
        case '0':
        case '1':
            ...
        case '9':
            yylval = c-'0';
            return( DIGIT );
            ...
        }
        ...


The intent is to return the token number of DIGIT, and a value equal to the
numerical value of the digit. Provided that the lexical analyzer code is placed in
the programs section of the specification file, the identifier DIGIT will be
defined as the token number associated with the token DIGIT.

This mechanism leads to clear, easily modified lexical analyzers; the only pitfall 
is the need to avoid using any token names in the grammar that are reserved or 
significant in C or the parser; for example, the use of if or while as token 
names will almost certainly cause severe difficulties when the lexical analyzer is 
compiled. The token name error is reserved for error handling, and should not 
be used naively (see Section 11.7).

As mentioned above, the token numbers may be chosen by yacc or by the programmer.
In the default situation, the numbers are chosen by yacc. The default
token number for a literal character is the numerical value of the character in the 
local character set. Other names are assigned token numbers starting at 257. 

To assign a token number to a token (including literals), the first appearance of 
the token name or literal in the declarations section can be immediately followed 
by a nonnegative integer. This integer is taken to be the token number of the 
name or literal. Names and literals not defined by this mechanism retain their 
default definition. It is important that all token numbers be distinct.

For historical reasons, the endmarker must have token number 0 or negative. 

This token number cannot be redefined by the programmer; thus, all lexical 
analyzers should be prepared to return 0 or negative as a token number upon 
reaching the end of their input. 
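
The conventions above (literal characters returned as their own token numbers, named tokens numbered from 257, and 0 returned at the end of input) can be sketched in a small analyzer. This is an illustration under stated assumptions, not the analyzer any particular specification would use: it reads from an in-memory string through a hypothetical set_input() helper, it defines DIGIT and yylval itself, whereas in a real program DIGIT comes from yacc and yylval is declared by the generated parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define DIGIT 257                 /* assumed: first default number for a named token */

int yylval;                       /* value of the token just read */
static const char *input = "";    /* hypothetical in-memory input source */

/* Point the analyzer at a new NUL-terminated input string. */
void set_input(const char *s)
{
    assert(s != NULL);
    input = s;
}

/* Return DIGIT (with its value in yylval) for a digit, the character
   itself for any other character, and 0 (the endmarker) at end of input. */
int yylex(void)
{
    unsigned char c = (unsigned char)*input;

    if (c == '\0')
        return 0;                 /* endmarker; stay at the end of the string */
    input++;
    if (isdigit(c)) {
        yylval = c - '0';
        return DIGIT;
    }
    return c;                     /* literal character: its own token number */
}
```

Calling yylex() again after the end of input keeps returning 0, so a parser that asks for one token too many still sees the endmarker.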

A very useful tool for constructing lexical analyzers is the lex program developed
by Mike Lesk and described in Chapter 10 (lex, a Lexical Analyzer Generator).
These lexical analyzers are designed to work
in close harmony with yacc parsers. The specifications use regular expressions
instead of grammar rules. lex can be easily used to produce quite complicated
lexical analyzers, but there remain some languages (such as FORTRAN) which do
not fit any theoretical framework, and whose lexical analyzers must be crafted by
hand.

11.4. How the Parser Works

yacc turns the specification file into a C program, which parses the input
according to the specification given. The algorithm used to go from the
specification to the parser is complex, and will not be discussed here (see the
references for more information). The parser itself, however, is relatively simple,
and understanding how it works, while not strictly necessary, will nevertheless
make treatment of error recovery and ambiguities much more comprehensible.

The parser produced by yacc consists of a finite-state machine with a stack.
The parser can read and remember the next input token (called the lookahead
token). The current state is always the one on the top of the stack. The states of
the finite-state machine are given small integer labels; initially, the machine is in
state 0, the stack contains only state 0, and no lookahead token has been read.

The machine has only four actions available to it, called shift, reduce, accept,
and error. A move of the parser is done as follows:

1. Based on its current state, the parser decides whether it needs a lookahead
token to decide what action should be done; if it needs one, and does not
have one, it calls yylex() to obtain the next token.

2. Using the current state, and the lookahead token if needed, the parser decides
on its next action, and carries it out. This may result in states being pushed
onto the stack, or popped off the stack, and in the lookahead token being
processed or left alone.

shift Action

The shift action is the most common action the parser takes. Whenever a shift
action is taken, there is always a lookahead token. For example, in state 56 there
may be an action:

    IF shift 34

which says, in state 56, if the lookahead token is IF, the current state (56) is
pushed down on the stack, and state 34 becomes the current state (on the top of
the stack). The lookahead token is cleared.

reduce Action

The reduce action keeps the stack from growing without bound. Reduce actions
are appropriate when the parser has seen the right hand side of a grammar rule,
and is prepared to announce that it has seen an instance of the rule, replacing the
right hand side by the left hand side. It may be necessary to consult the lookahead
token to decide whether to reduce, but usually it is not; in fact, the default
action (represented by a ‘.’) is often a reduce action.


Reduce actions are associated with individual grammar rules. Grammar rules are
also given small integer numbers, leading to some confusion. The action

    . reduce 18

refers to grammar rule 18, while the action

    IF shift 34

refers to state 34.


Suppose the rule being reduced is

    A : x y z ;

The reduce action depends on the left hand symbol (A in this case), and the
number of symbols on the right hand side (three in this case). To reduce, first
pop off the top three states from the stack (in general, the number of states
popped equals the number of symbols on the right side of the rule). In effect,
these states were the ones put on the stack while recognizing x, y, and z, and no
longer serve any useful purpose. After popping these states, a state is uncovered
which was the state the parser was in before beginning to process the rule. Using
this uncovered state, and the symbol on the left side of the rule, perform what is
in effect a shift of A. A new state is obtained, pushed onto the stack, and parsing
continues. There are significant differences between the processing of the left
hand symbol and an ordinary shift of a token, however, so this action is called a
goto action. In particular, the lookahead token is cleared by a shift, and is not
affected by a goto. In any case, the uncovered state contains an entry such as:

    A goto 20

which causes state 20 to be pushed onto the stack, and become the current state.

In effect, the reduce action ‘turns back the clock’ in the parse, popping the states
off the stack to go back to the state where the right hand side of the rule was first
seen. The parser then behaves as if it had seen the left side at that time. If the
right hand side of the rule is empty, no states are popped off the stack: the
uncovered state is in fact the current state.

The reduce action is also important in the treatment of programmer-supplied
actions and values. When a rule is reduced, the code supplied with the rule is
executed before the stack is adjusted. In addition to the stack holding the states,
another stack, running in parallel with it, holds the values returned from the lexical
analyzer and the actions. When a shift takes place, the external variable
yylval is copied onto the value stack. After the return from the
programmer’s code, the reduction is carried out. When the goto action is done,
the external variable yyval is copied onto the value stack. The pseudo-variables
$1, $2, etc., refer to the value stack.
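
The interplay of the two stacks can be sketched in a few lines of C. This is a simplified model for illustration, not the code yacc generates: the function names start(), shift(), reduce(), and dollar(), the STACKMAX limit, and passing yylval and yyval as arguments are all assumptions of the sketch.

```c
#include <assert.h>

#define STACKMAX 64             /* hypothetical stack depth */

static int states[STACKMAX];    /* state stack */
static int values[STACKMAX];    /* value stack, running in parallel */
static int top = -1;            /* index of the current state */

/* Push a state and its associated value on both stacks. */
static void push(int state, int value)
{
    assert(top + 1 < STACKMAX);
    top++;
    states[top] = state;
    values[top] = value;
}

/* Start a parse: the stack contains only state 0. */
void start(void)
{
    top = -1;
    push(0, 0);
}

/* shift: the new state is pushed, and yylval is copied onto the value stack. */
void shift(int newstate, int yylval)
{
    push(newstate, yylval);
}

/* Value of $i for a rule with n right-hand-side symbols,
   read before the stack is adjusted. */
int dollar(int i, int n)
{
    assert(i >= 1 && i <= n && top - n >= 0);
    return values[top - n + i];
}

/* reduce: pop n states, then do the goto, pushing the state named in the
   uncovered state's goto entry and copying yyval onto the value stack. */
void reduce(int n, int gotostate, int yyval)
{
    assert(n >= 0 && top - n >= 0);
    top -= n;
    push(gotostate, yyval);
}
```

For a rule with an empty right hand side, reduce() is called with n equal to 0, so nothing is popped and the goto is taken from the current state, as described above.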

accept and error Actions

The other two parser actions are conceptually much simpler. The accept action
indicates that the entire input has been seen and that it matches the specification.
This action appears only when the lookahead token is the endmarker, and indicates
that the parser has successfully done its job. The error action, on the other
hand, represents a place where the parser can no longer continue parsing
according to the specification. The input tokens it has seen, together with the
lookahead token, cannot be followed by anything that would result in a legal
input. The parser reports an error, and attempts to recover the situation and
resume parsing: the error recovery (as opposed to the detection of error) will be
covered in Section 11.7.


It is time for an example! Consider the specification

    %token DING DONG DELL
    %%
    rhyme : sound place
          ;
    sound : DING DONG
          ;
    place : DELL
          ;


When yacc is invoked with the -v option, a file called y.output is produced,
with a human-readable description of the parser. The y.output file corresponding
to the above grammar (with some statistics stripped off the end) is:

    state 0
            $accept : _rhyme $end

            DING  shift 3
            .  error

            rhyme  goto 1
            sound  goto 2

    state 1
            $accept : rhyme_$end

            $end  accept
            .  error

    state 2
            rhyme : sound_place

            DELL  shift 5
            .  error

            place  goto 4

    state 3
            sound : DING_DONG

            DONG  shift 6
            .  error

    state 4
            rhyme : sound place_    (1)

            .  reduce 1

    state 5
            place : DELL_    (3)

            .  reduce 3

    state 6
            sound : DING DONG_    (2)

            .  reduce 2


Notice that, in addition to the actions for each state, there is a description of the 
parsing rules being processed in each state. The _ character is used to indicate 
what has been seen, and what is yet to come, in each rule. Suppose the input is 

DING DONG DELL 

It is instructive to follow the steps of the parser while processing this input.

Initially, the current state is state 0. The parser needs to refer to the input in 
order to decide between the actions available in state 0, so the first token, DING, 
is read, becoming the lookahead token. The action in state 0 on DING is ‘shift 
3’, so state 3 is pushed onto the stack, and the lookahead token is cleared. State 3 
becomes the current state. The next token, DONG, is read, becoming the lookahead
token. The action in state 3 on the token DONG is ‘shift 6’, so state 6 is
pushed onto the stack, and the lookahead is cleared. The stack now contains 0, 3,
and 6. In state 6, without even consulting the lookahead, the parser reduces by
rule 2.

    sound : DING DONG


This rule has two symbols on the right hand side, so two states, 6 and 3, are 
popped off the stack, uncovering state 0. Consulting the description of state 0, 
looking for a goto on sound, 

    sound goto 2

is obtained; thus state 2 is pushed onto the stack, becoming the current state. 

In state 2, the next token, DELL, must be read. The action is ‘shift 5’, so state 5
is pushed onto the stack, which now has 0, 2, and 5 on it, and the lookahead
token is cleared. In state 5, the only action is to reduce by rule 3. This has one
symbol on the right hand side, so one state, 5, is popped off, and state 2 is
uncovered. The goto in state 2 on place, the left side of rule 3, is state 4. Now,
the stack contains 0, 2, and 4. In state 4, the only action is to reduce by rule 1.
There are two symbols on the right, so the top two states are popped off, uncovering
state 0 again. In state 0, there is a goto on rhyme causing the parser to enter
state 1. In state 1, the input is read; the endmarker is obtained, indicated by
‘$end’ in the y.output file. The action in state 1 when the endmarker is seen is
to accept, successfully ending the parse.

The reader is urged to consider how the parser works when confronted with such
incorrect strings as DING DONG DONG, DING DONG, DING DONG DELL
DELL, and so on. A few minutes spent with this and other simple examples will
probably be repaid when problems arise in more complicated contexts.
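
One way to experiment with such inputs is to code the seven states by hand. The following sketch hard-wires the shift, reduce, and goto entries from the y.output listing into a small C function. It is not the parser that yacc generates: the parse() and goto_state() names, the token numbers chosen for DING, DONG, and DELL, and the convention of passing the input as an array ending in 0 are assumptions made for illustration.

```c
#include <assert.h>

enum { END = 0, DING = 257, DONG, DELL };   /* 0 is the endmarker */
enum { RHYME, SOUND, PLACE };               /* nonterminal symbols */

/* Goto entries: the state entered after reducing to nonterminal nt
   when state s is uncovered, or -1 if there is no such entry. */
static int goto_state(int s, int nt)
{
    if (s == 0 && nt == RHYME) return 1;
    if (s == 0 && nt == SOUND) return 2;
    if (s == 2 && nt == PLACE) return 4;
    return -1;
}

/* Parse a token array terminated by END.
   Return 1 if the input is accepted, 0 on a syntax error. */
int parse(const int *tokens)
{
    int stack[16];
    int top = 0;
    int la = -1;          /* lookahead token; -1 means none has been read */

    stack[0] = 0;
    for (;;) {
        int s = stack[top];
        int len, lhs, next;

        assert(top < 15);
        if (s >= 4) {     /* states 4, 5, 6 reduce without a lookahead */
            if (s == 4)      { lhs = RHYME; len = 2; }   /* rule 1 */
            else if (s == 5) { lhs = PLACE; len = 1; }   /* rule 3 */
            else             { lhs = SOUND; len = 2; }   /* rule 2 */
            top -= len;                       /* pop the right hand side */
            next = goto_state(stack[top], lhs);
            if (next < 0)
                return 0;
            stack[++top] = next;              /* goto: lookahead unaffected */
            continue;
        }
        if (la < 0)
            la = *tokens++;                   /* read the next token */
        if (s == 1 && la == END)
            return 1;                         /* accept */
        if (s == 0 && la == DING)      next = 3;
        else if (s == 2 && la == DELL) next = 5;
        else if (s == 3 && la == DONG) next = 6;
        else return 0;                        /* error */
        stack[++top] = next;                  /* shift */
        la = -1;                              /* lookahead is cleared */
    }
}
```

Calling parse() on the tokens DING, DONG, DELL followed by 0 follows the same sequence of stack contents traced above; the incorrect strings end in the error action of state 1, 2, or 3.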

11.5. Ambiguity and Conflicts

A set of grammar rules is ambiguous if there is some input string that can be
structured in two or more different ways. For example, the grammar rule

    expr : expr '-' expr

is a natural way of expressing the fact that one way of forming an arithmetic
expression is to put two other expressions together with a minus sign between
them. Unfortunately, this grammar rule does not unambiguously specify the way
that all complex inputs should be structured. For example, if the input is

    expr - expr - expr

the rule allows this input to be structured as either

    ( expr - expr ) - expr

or as

    expr - ( expr - expr )

The first is called left association, the second right association. 
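
The two groupings generally give different results, which is why the choice matters. As a small illustration (the function names left_assoc and right_assoc are hypothetical), the following C sketch evaluates a chain of subtractions both ways:

```c
#include <assert.h>

/* Left association: ((v[0] - v[1]) - v[2]) - ... */
int left_assoc(const int *v, int n)
{
    int i, r;

    assert(n >= 1);
    r = v[0];
    for (i = 1; i < n; i++)
        r = r - v[i];
    return r;
}

/* Right association: v[0] - (v[1] - (v[2] - ...)) */
int right_assoc(const int *v, int n)
{
    int i, r;

    assert(n >= 1);
    r = v[n - 1];
    for (i = n - 2; i >= 0; i--)
        r = v[i] - r;
    return r;
}
```

For the values 8, 3, and 2, left association computes (8 - 3) - 2, which is 3, while right association computes 8 - (3 - 2), which is 7.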

yacc detects such ambiguities when it is attempting to build the parser. It is
instructive to consider the problem that confronts the parser when it is given an
input such as

    expr - expr - expr

When the parser has read the second expr, the input that it has seen:

    expr - expr

matches the right side of the grammar rule above. The parser could reduce the
input by applying this rule; after applying the rule, the input is reduced to expr
(the left side of the rule). The parser would then read the final part of the input:

    - expr

and again reduce. The effect of this is to take the left-associative interpretation.
Alternatively, when the parser has seen

    expr - expr

it could defer the immediate application of the rule, and continue reading the
input until it had seen

    expr - expr - expr

It could then apply the rule to the rightmost three symbols, reducing them to expr
and leaving

    expr - expr

Now the rule can be reduced once more; the effect is to take the right-associative
interpretation. Thus, having read

    expr - expr

the parser can do two legal things, a shift or a reduction, and has no way of
deciding between them. This is called a shift/reduce conflict. It may also happen
that the parser has a choice of two legal reductions; this is called a reduce/reduce
conflict. Note that there are never any ‘shift/shift’ conflicts.

When there are shift/reduce or reduce/reduce conflicts, yacc still produces a
parser. It does this by selecting one of the valid steps wherever it has a choice.
A rule describing which choice to make in a given situation is called a disambiguating
rule.

yacc invokes two disambiguating rules by default:

1. In a shift/reduce conflict, the default is to do the shift.

2. In a reduce/reduce conflict, the default is to reduce by the earlier grammar 
rule (in the input sequence). 

Rule 1 implies that reductions are deferred whenever there is a choice, in favor of 
shifts. Rule 2 gives the programmer rather crude control over the behavior of the 
parser in this situation, but reduce/reduce conflicts should be avoided whenever 
possible. 

Conflicts may arise because of mistakes in input or logic, or because the grammar
rules, while consistent, require a more complex parser than yacc can construct.
The use of actions within rules can also cause conflicts, if the action must
be done before the parser can be sure which rule is being recognized. In these
cases, the application of disambiguating rules is inappropriate, and leads to an 
incorrect parser. For this reason, yacc always reports the number of shift/reduce 
and reduce/reduce conflicts resolved by Rule 1 and Rule 2. 

In general, whenever it is possible to apply disambiguating rules to produce a 
correct parser, it is also possible to rewrite the grammar rules so that the same 
inputs are read but there are no conflicts. For this reason, most previous parser 
generators have considered conflicts to be fatal errors. Our experience has suggested
that this rewriting is somewhat unnatural, and produces slower parsers;
thus, yacc will produce parsers even in the presence of conflicts. 


As an example of the power of disambiguating rules, consider a fragment from a 
programming language involving an ‘if-then-else’ construction: 



    stat : IF '(' cond ')' stat
         | IF '(' cond ')' stat ELSE stat
         ;


In these rules, IF and ELSE are tokens, cond is a nonterminal symbol describing
conditional (logical) expressions, and stat is a nonterminal symbol describing
statements. The first rule will be called the simple-if rule, and the second the
if-else rule.


These two rules form an ambiguous construction, since input of the form:

    IF ( condition-1 ) IF ( condition-2 ) statement-1 ELSE statement-2

can be structured according to these rules in two ways:

    IF ( condition-1 ) {
        IF ( condition-2 ) statement-1
    }
    ELSE statement-2

or

    IF ( condition-1 ) {
        IF ( condition-2 ) statement-1
        ELSE statement-2
    }


The second interpretation is the one given in most programming languages having
this construct. Each ELSE is associated with the last preceding ‘un-ELSE'd’
IF. In this example, consider the situation where the parser has seen

    IF ( condition-1 ) IF ( condition-2 ) statement-1

and is looking at the ELSE. It can immediately reduce by the simple-if rule to
get

    IF ( condition-1 ) stat

and then read the remaining input,

    ELSE statement-2

and reduce

    IF ( condition-1 ) stat ELSE statement-2

by the if-else rule. This leads to the first of the above groupings of the input.

On the other hand, the ELSE may be shifted, statement-2 read, and then the right
hand portion of

    IF ( condition-1 ) IF ( condition-2 ) statement-1 ELSE statement-2

can be reduced by the if-else rule to get

    IF ( condition-1 ) stat

which can be reduced by the simple-if rule. This leads to the second of the above
groupings of the input, which is usually desired.

Once again the parser can do two valid things: there is a shift/reduce conflict.
The application of disambiguating rule 1 tells the parser to shift in this case,
which leads to the desired grouping.

This shift/reduce conflict arises only when there is a particular current input symbol,
ELSE, and particular inputs already seen, such as

    IF ( condition-1 ) IF ( condition-2 ) statement-1

In general, there may be many conflicts, and each one will be associated with an 
input symbol and a set of previously read inputs. The previously read inputs are 
characterized by the state of the parser. 

The conflict messages of yacc are best understood by examining the verbose 
(-v) option output file. For example, the output corresponding to the above 
conflict state might be: 


    23: shift/reduce conflict (shift 45, reduce 18) on ELSE

    state 23

            stat : IF ( cond ) stat_    (18)
            stat : IF ( cond ) stat_ELSE stat

            ELSE  shift 45
            .  reduce 18


The first line describes the conflict, giving the state and the input symbol. The 
ordinary state description follows, giving the grammar rules active in the state, 
and the parser actions. Recall that the underline marks the portion of the grammar
rules which has been seen. Thus in the example, in state 23 the parser has
seen input corresponding to

    IF ( cond ) stat

and the two grammar rules shown are active at this time. The parser can do two
possible things. If the input symbol is ELSE, it is possible to shift into state 45.
State 45 will have, as part of its description, the line

    stat : IF ( cond ) stat ELSE_stat

since the ELSE will have been shifted in this state. Back in state 23, the alternative
action, described by ‘.’, is to be done if the input symbol is not mentioned
explicitly in the above actions; thus, in this case, if the input symbol is not ELSE,
the parser reduces by grammar rule 18:

    stat : IF '(' cond ')' stat

Once again, notice that the numbers following ‘shift’ commands refer to other
states, while the numbers following ‘reduce’ commands refer to grammar rule
numbers. In the y.output file, the rule numbers are printed after those rules
which can be reduced. In most states, there will be at most one reduce action
possible in the state, and this will be the default command. Programmers who
encounter unexpected shift/reduce conflicts will probably want to look at the verbose
output to decide whether the default actions are appropriate. In really tough
cases, the programmer might need to know more about the behavior and construction
of the parser than can be covered here. In this case, one of the theoretical
references cited in Chapter 1 might be consulted.

11.6. Precedence

There is one common situation where the rules given above for resolving
conflicts are not sufficient; this is in the parsing of arithmetic expressions. Most
of the commonly used constructions for arithmetic expressions can be naturally
described by the notion of precedence levels for operators, together with information
about left or right associativity. It turns out that ambiguous grammars
with appropriate disambiguating rules can be used to create parsers that are faster
and easier to write than parsers constructed from unambiguous grammars. The
basic notion is to write grammar rules of the form

    expr : expr OP expr

and

    expr : UNARY expr

for all binary and unary operators desired. This creates a very ambiguous grammar,
with many parsing conflicts. As disambiguating rules, the programmer
specifies the precedence, or binding strength, of all the operators, and the associativity
of the binary operators. This information is sufficient to allow yacc to
resolve the parsing conflicts in accordance with these rules, and construct a
parser that realizes the desired precedences and associativities.

The precedences and associativities are attached to tokens in the declarations section.
This is done by a series of lines beginning with a yacc keyword: %left,
%right, or %nonassoc, followed by a list of tokens. All of the tokens on the
same line are assumed to have the same precedence level and associativity; the
lines are listed in order of increasing precedence or binding strength. Thus,

    %left '+' '-'
    %left '*' '/'


describes the precedence and associativity of the four arithmetic operators. Plus
and minus are left-associative, and have lower precedence than star and slash,
which are also left-associative. The keyword %right is used to describe right-associative
operators, and the keyword %nonassoc is used to describe operators,
like the .LT. operator in FORTRAN, that may not associate with themselves;
thus,

    A .LT. B .LT. C

is illegal in FORTRAN, and such an operator would be described with the keyword
%nonassoc in yacc. As an example of the behavior of these declarations,
the description

    %right '='
    %left '+' '-'
    %left '*' '/'

    %%

    expr : expr '=' expr
         | expr '+' expr
         | expr '-' expr
         | expr '*' expr
         | expr '/' expr
         | NAME
         ;

might be used to structure the input

    a = b = c*d - e - f*g

as follows:

    a = ( b = ( ((c*d)-e) - (f*g) ) )

When this mechanism is used, unary operators must, in general, be given a precedence.
Sometimes a unary operator and a binary operator have the same symbolic
representation, but different precedences. An example is unary and binary
‘-’; unary minus may be given the same strength as multiplication, or even
higher, while binary minus has a lower strength than multiplication. The keyword
%prec changes the precedence level associated with a particular grammar
rule. %prec appears immediately after the body of the grammar rule, before the
action or closing semicolon, and is followed by a token name or literal. It
changes the precedence of the grammar rule to become that of the following
token name or literal. For example, to make unary minus have the same precedence
as multiplication the rules might resemble:

    %left '+' '-'
    %left '*' '/'

    %%

    expr : expr '+' expr
         | expr '-' expr
         | expr '*' expr
         | expr '/' expr
         | '-' expr %prec '*'
         | NAME
         ;





A token declared by %left, %right, and %nonassoc need not be, but may
be, declared by %token as well.

The precedences and associativities are used by yacc to resolve parsing
conflicts; they give rise to disambiguating rules. Formally, the rules work as follows:

1. The precedences and associativities are recorded for those tokens and literals 
that have them. 

2. A precedence and associativity is associated with each grammar rule; it is
the precedence and associativity of the last token or literal in the body of the
rule. If the %prec construction is used, it overrides this default. Some
grammar rules may have no precedence and associativity associated with
them.

3. When there is a reduce/reduce conflict, or there is a shift/reduce conflict and
either the input symbol or the grammar rule has no precedence and associativity,
then the two disambiguating rules given in Section 11.5
are used, and the conflicts are reported.

4. If there is a shift/reduce conflict, and both the grammar rule and the input 
character have precedence and associativity associated with them, then the 
conflict is resolved in favor of the action (shift or reduce) associated with the 
higher precedence. If the precedences are the same, then the associativity is 
used; left-associative implies reduce, right-associative implies shift, and 
nonassociating implies error. 
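
Rules 3 and 4, as they apply to a shift/reduce conflict, can be summarized in a short decision function. This is a sketch of the rules as stated here, not of yacc's internal code: the resolve() name, the enum constants, the use of 0 to mean "no precedence", and the reported flag are assumptions made for illustration. Because tokens with equal precedence were declared on the same line, a single associativity argument covers the tie case.

```c
#include <assert.h>
#include <stddef.h>

enum assoc  { LEFT, RIGHT, NONASSOC };
enum action { DO_SHIFT, DO_REDUCE, DO_ERROR };

/* Resolve a shift/reduce conflict between a grammar rule and an input token.
   rule_prec and tok_prec are precedence levels, with 0 meaning "none";
   a is the associativity shared by the level when the precedences are equal.
   *reported is set to 1 when the conflict would be counted and reported. */
enum action resolve(int rule_prec, int tok_prec, enum assoc a, int *reported)
{
    assert(reported != NULL);
    *reported = 0;
    if (rule_prec == 0 || tok_prec == 0) {
        *reported = 1;            /* rule 3: fall back to the default, shift */
        return DO_SHIFT;
    }
    if (rule_prec > tok_prec)     /* rule 4: higher precedence wins */
        return DO_REDUCE;
    if (tok_prec > rule_prec)
        return DO_SHIFT;
    switch (a) {                  /* equal precedence: use associativity */
    case LEFT:  return DO_REDUCE;
    case RIGHT: return DO_SHIFT;
    default:    return DO_ERROR;  /* nonassociating */
    }
}
```

With '+' and '-' at level 1 and '*' and '/' at level 2, both left-associative, the parser that has seen expr - expr reduces when the next token is '-' and shifts when it is '*'.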

Conflicts resolved by precedence are not counted in the number of shift/reduce 
and reduce/reduce conflicts reported by yacc. This means that mistakes in the 
specification of precedences may disguise errors in the input grammar; it is a 
good idea to be sparing with precedences, and use them in an essentially ‘cookbook’
fashion, until some experience has been gained. The y.output file is very
useful in deciding whether the parser is actually doing what was intended.

11.7. Error Handling

Error handling is an extremely difficult area, and many of the problems are
semantic ones. When an error is found, for example, it may be necessary to
reclaim parse tree storage, delete or alter symbol table entries, and, typically, set
switches to avoid generating any further output.

It is seldom acceptable to stop all processing when an error is found; it is more 
useful to continue scanning the input to find further syntax errors. This leads to 
the problem of getting the parser ‘restarted’ after an error. A general class of 
algorithms to do this involves discarding a number of tokens from the input 
string, and attempting to adjust the parser so that input can continue. 

To allow the programmer some control over this process, yacc provides a simple,
but reasonably general, feature. The token name ‘error’ is reserved for error
handling. This name can be used in grammar rules; in effect, it suggests places 
where errors are expected, and recovery might take place. The parser pops its 
stack until it enters a state where the token ‘error’ is legal. It then behaves as if 
the token ‘error’ were the current lookahead token, and performs the action 
encountered. The lookahead token is then reset to the token that caused the error. 
If no special error rules have been specified, the processing halts when an error is 
detected. 

In order to prevent a cascade of error messages, the parser, after detecting an 
error, remains in error state until three tokens have been successfully read and 
shifted. If an error is detected when the parser is already in error state, no message
is given, and the input token is quietly deleted.
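
The three-token rule can be modeled with a single counter. The sketch below follows the description in this section rather than the code of any particular yacc implementation: the names errflag, on_error(), and on_shift(), and the choice to restart the count when an error occurs inside error state, are assumptions made for illustration.

```c
#include <assert.h>

static int errflag = 0;     /* shifts still needed before leaving error state */
static int messages = 0;    /* number of error messages issued so far */

/* Called when a syntax error is detected. Returns 1 if a message is
   issued, 0 if the parser was already in error state and stays silent. */
int on_error(void)
{
    int report = (errflag == 0);

    if (report)
        messages++;
    errflag = 3;            /* three successful shifts are required again */
    return report;
}

/* Called each time a token is successfully read and shifted. */
void on_shift(void)
{
    assert(errflag >= 0);
    if (errflag > 0)
        errflag--;
}
```

An error that follows only two successful shifts is therefore silent, while one that follows three or more produces a new message.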

As an example, a rule of the form 

stat : error 

would, in effect, mean that on a syntax error the parser would attempt to skip 
over the statement in which the error was seen. More precisely, the parser will 
scan ahead, looking for three tokens that might legally follow a statement, and 
start processing at the first of these; if the beginnings of statements are not 
sufficiently distinctive, it may make a false start in the middle of a statement, and 
end up reporting a second error where there is in fact no error. 

Actions may be used with these special error rules. These actions might attempt 
to reinitialize tables, reclaim symbol table space, etc. 

Error rules such as the above are very general, but difficult to control. Somewhat 
easier are rules such as 

    stat : error ';'

Here, when there is an error, the parser attempts to skip over the statement, but
will do so by skipping to the next ‘;’. All tokens after the error and before the
next ‘;’ cannot be shifted, and are discarded. When the ‘;’ is seen, this rule will
be reduced, and any ‘cleanup’ action associated with it performed.

Another form of error rule arises in interactive applications, where it may be 
desirable to permit a line to be reentered after an error. A possible error rule 
might be

    input : error '\n'
                { printf( "Reenter last line: " ); }
            input
                { $$ = $4; }
          ;


There is one potential difficulty with this approach; the parser must correctly process
three input tokens before it admits that it has correctly resynchronized after
the error. If the reentered line contains an error in the first two tokens, the parser
deletes the offending tokens, and gives no message; this is clearly unacceptable.
For this reason, there is a mechanism that can be used to force the parser to 
believe that an error has been fully recovered from. The statement 

yyerrok ; 


in an action resets the parser to its normal mode. The last example is better written

    input : error '\n'
                { yyerrok;
                  printf( "Reenter last line: " ); }
            input
                { $$ = $4; }
          ;


As mentioned above, the token seen immediately after the ‘error’ symbol is the 
input token at which the error was discovered. Sometimes, this is inappropriate; 
for example, an error recovery action might take upon itself the job of finding the 
correct place to resume input. In this case, the previous lookahead token must be 
cleared. The statement 

yyclearin ;

in an action will have this effect. For example, suppose the action after error
were to call some sophisticated resynchronization routine, supplied by the
programmer, that attempted to advance the input to the beginning of the next valid
statement. After this routine was called, the next token returned by yylex()
would presumably be the first token in a legal statement; the old, illegal token
must be discarded, and the error state reset. This could be done by a rule like

    stat    : error
                    { resynch();
                      yyerrok ;
                      yyclearin ; }
            ;
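
The resynch() routine is left entirely to the programmer. As a rough sketch
only, assuming that statements end with ';' and that the pending input is held
in a memory buffer, such a routine might simply discard everything up to and
including the next semicolon. The name skip_to_semicolon() and its buffer
argument are illustrative assumptions; nothing of the kind is supplied by yacc.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical resynchronization helper (not part of yacc).
 * Scans forward from p and returns a pointer to the character just
 * after the next ';'.  If no ';' remains, returns a pointer to the
 * terminating NUL, that is, to the end of the input.
 */
const char *skip_to_semicolon(const char *p)
{
    assert(p != NULL);
    while (*p != '\0') {
        if (*p++ == ';')
            return p;       /* first character of the next statement */
    }
    return p;               /* ran off the end of the input */
}
```

A real routine would more likely read from the same stream as yylex(), and
might stop at several kinds of statement boundary rather than one.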


These mechanisms are admittedly crude, but do allow for a simple, fairly
effective recovery of the parser from many errors; moreover, the programmer can get
control to deal with the error actions required by other portions of the program.

11.8. The yacc Environment

When the programmer inputs a specification to yacc, the output is a file of C
programs, called y.tab.c on most systems (due to local file system conventions,
the name may differ from installation to installation). yacc produces an
integer-valued function called yyparse(). When yyparse() is called, it in
turn repeatedly calls yylex(), the lexical analyzer supplied by the programmer
(see Section 11.3), to obtain input tokens. Eventually, either an error is
detected, in which case (if no error recovery is possible) yyparse() returns the
value 1, or the lexical analyzer returns the endmarker token and the parser
accepts. In this case, yyparse() returns the value 0.

The programmer must provide a certain amount of environment for this parser in
order to obtain a working program. For example, as with every C program, a
function called main() must be defined, that eventually calls yyparse(). In
addition, a routine called yyerror() prints a message when a syntax error is
detected.

The programmer must supply these two routines in one form or another. They
can be as simple as the following example, or they can be as complex as needed.

    main() {
            return( yyparse() );
    }

and

    # include <stdio.h>

    yyerror(s) char *s; {
            fprintf( stderr, "%s\n", s );
    }

The argument to yyerror() is a string containing an error message, usually
the string ‘syntax error’. The average application will want to do better than this.
Ordinarily, the program should keep track of the input line number, and print it
along with the message when a syntax error is detected. The external integer
variable yychar contains the lookahead token number at the time the error was
detected; this may be of some interest in giving better diagnostics.

The external integer variable yydebug is normally set to 0. If it is set to a
nonzero value, the parser generates a verbose description of its actions, including
a discussion of which input symbols have been read, and what the parser actions
are. Depending on the operating environment, it may be possible to set this
variable by using a debugging system.

11.9. Hints for Preparing Specifications

This section contains miscellaneous hints on preparing efficient, easy to change,
and clear specifications. The individual subsections are more or less
independent.
Input Style

It is difficult to provide rules with substantial actions and still have a readable
specification file. The following style hints owe much to Brian Kernighan.

1. Use all capital letters for token names, all lower case letters for nonterminal
names. This rule comes under the heading of ‘knowing who to blame when
things go wrong.’

2. Put grammar rules and actions on separate lines. This allows either to be
changed without an automatic need to change the other.

3. Put all rules with the same left hand side together. Put the left hand side in
only once, and let all following rules begin with a vertical bar.

4. Put a semicolon only after the last rule with a given left hand side, and put
the semicolon on a separate line. This allows new rules to be added easily.

5. Indent rule bodies by two tab stops, and action bodies by three tab stops.

The example in Section 11.11 is written following this style, as are the examples
in the text of this paper (where space permits). The programmer must make up
his own mind about these stylistic questions; the central problem, however, is to
make the rules visible through the morass of action code.

Left Recursion

The algorithm used by the yacc parser encourages so-called ‘left-recursive’
grammar rules: rules of the form

    name    : name rest_of_rule ;


These rules frequently arise when writing specifications of sequences and lists: 


    list    : item
            | list ',' item
            ;

and

    seq     : item
            | seq item
            ;


In each of these cases, the first rule will be reduced for the first item only, and the 
second rule will be reduced for the second and all succeeding items. 


With right-recursive rules, such as 


    seq     : item
            | item seq
            ;

the parser would be a bit bigger, and the items would be seen, and reduced, from
right to left. More seriously, an internal stack in the parser would be in danger of
overflowing if a very long sequence were read. Thus, the programmer should use
left recursion wherever reasonable.
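
The difference in stack use can be made concrete with a toy model. The sketch
below is not the yacc parser; it merely counts how many grammar symbols a
shift-reduce parser would hold at once while reading n items under each form
of the seq rule. The function names are illustrative, and the real parser
stack also holds states, which this count ignores.

```c
#include <assert.h>

/* seq : item | seq item   -- each item is reduced as soon as it is shifted */
int max_depth_left(int n)
{
    int depth = 0, max = 0, i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        depth++;                    /* shift item */
        if (depth > max)
            max = depth;
        if (i > 0)
            depth--;                /* reduce "seq item" to seq: 2 symbols -> 1 */
        /* for i == 0 the reduction "item" to seq leaves the depth at 1 */
    }
    return max;
}

/* seq : item | item seq   -- nothing can be reduced until the last item */
int max_depth_right(int n)
{
    int depth = 0, max = 0, i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        depth++;                    /* shift every item first */
        if (depth > max)
            max = depth;
    }
    for (i = 1; i < n; i++)
        depth--;                    /* then reduce "item seq" to seq, n-1 times */
    return max;
}
```

In this model the left-recursive form never holds more than two symbols,
while the right-recursive form holds one symbol per item read.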
It is worth considering whether a sequence with zero elements has any meaning,
and if so, consider writing the sequence specification with an empty rule:

    seq     : /* empty */
            | seq item
            ;

Once again, the first rule would always be reduced exactly once, before the first
item was read, and then the second rule would be reduced once for each item
read. Permitting empty sequences often leads to increased generality. However,
conflicts might arise if yacc is asked to decide which empty sequence it has
seen, when it hasn’t seen enough to know!

Lexical Tie-ins

Some lexical decisions depend on context. For example, the lexical analyzer
might want to delete blanks normally, but not within quoted strings. Or names
might be entered into a symbol table in declarations, but not in expressions.


One way of handling this situation is to create a global flag that is examined by 
the lexical analyzer, and set by actions. For example, suppose a program consists 
of 0 or more declarations, followed by 0 or more statements. Consider: 


    %{
            int dflag;
    %}
      ... other declarations ...

    %%

    prog    : decls stats
            ;

    decls   : /* empty */
                    { dflag = 1; }
            | decls declaration
            ;

    stats   : /* empty */
                    { dflag = 0; }
            | stats statement
            ;

      ... other rules ...







The flag dflag is now 0 when reading statements, and 1 when reading
declarations, except for the first token in the first statement. This token must be seen by
the parser before it can tell that the declaration section has ended and the
statements have begun. In many cases, this single-token exception does not affect the
lexical scan.
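
On the lexical analyzer's side, the flag might be consulted as in the
following sketch. It assumes a toy symbol table and a hypothetical helper,
name_token(), called for each identifier read; neither is part of yacc, and
only dflag comes from the specification above.

```c
#include <assert.h>
#include <string.h>

#define MAXSYMS 64

int dflag = 1;                          /* set and cleared by parser actions */

static const char *symtab[MAXSYMS];     /* toy symbol table (illustrative) */
static int nsyms = 0;

/* Return the table index of name, or -1 if it has not been entered. */
static int lookup(const char *name)
{
    int i;

    for (i = 0; i < nsyms; i++)
        if (strcmp(symtab[i], name) == 0)
            return i;
    return -1;
}

/*
 * Hypothetical routine called by the lexical analyzer for each name.
 * While declarations are being read (dflag nonzero) an unknown name is
 * entered in the table; while statements are being read it is only
 * looked up.  The caller must keep the string alive, since only the
 * pointer is stored.  Returns the table index, or -1 if not found.
 */
int name_token(const char *name)
{
    int i;

    assert(name != NULL);
    i = lookup(name);
    if (i < 0 && dflag && nsyms < MAXSYMS) {
        symtab[nsyms] = name;
        i = nsyms++;
    }
    return i;
}
```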

This kind of ‘backdoor’ approach can be elaborated to a noxious degree.
Nevertheless, it represents a way of doing some things that are difficult, if not
impossible, to do otherwise.

Reserved Words

Some programming languages permit the programmer to use words like ‘if’,
which are normally reserved, as label or variable names, provided that such use
does not conflict with the legal use of these names in the programming language.
This is extremely hard to do in the framework of yacc; it is difficult to pass
information to the lexical analyzer telling it ‘this instance of if is a keyword,
and that instance is a variable’. The programmer can make a stab at it, using the
mechanism described in the last subsection, but it is difficult.

A number of ways of making this easier are under advisement. Until then, it is
better that the keywords be reserved; that is, be forbidden for use as variable
names. There are powerful stylistic reasons for preferring this, anyway.

11.10. Advanced Topics

This section discusses a number of advanced features of yacc.

Simulating Error and Accept in Actions

The parsing actions of error and accept can be simulated in an action by use of
macros YYACCEPT and YYERROR. YYACCEPT makes yyparse() return the
value 0; YYERROR makes the parser behave as if the current input symbol results
in a syntax error; yyerror() is called, and error recovery takes place. These
mechanisms can be used to simulate parsers with multiple endmarkers or
context-sensitive syntax checking.

Accessing Values in Enclosing Rules

An action may refer to values returned by actions to the left of the current rule.
The mechanism is simply the same as with ordinary actions, a dollar sign
followed by a digit, but in this case the digit may be 0 or negative. Consider

    sent    : adj noun verb adj noun
                    { look at the sentence ... }
            ;

    adj     : THE
                    { $$ = THE; }
            | YOUNG
                    { $$ = YOUNG; }
            ;

    noun    : DOG
                    { $$ = DOG; }
            | CRONE
                    { if( $0 == YOUNG ){
                              printf( "what?\n" );
                      }
                      $$ = CRONE;
                    }
            ;


In the action following the word CRONE, a check is made that the preceding
token shifted was not YOUNG. Obviously, this is only possible when a great deal
is known about what might precede the symbol noun in the input. There is also a
distinctly unstructured flavor about this. Nevertheless, at times this mechanism
will save a great deal of trouble, especially when a few combinations are to be
excluded from an otherwise regular structure.

Support for Arbitrary Value Types

By default, the values returned by actions and the lexical analyzer are integers.
yacc can also support values of other types, including structures. In addition,
yacc keeps track of the types, and inserts appropriate union member names so
that the resulting parser will be strictly type checked. The yacc value stack (see
Section 11.4) is declared to be a union of the various types of values desired.
The programmer declares the union, and associates a union member name to
each token and nonterminal symbol having a value. When the value is
referenced through a $$ or $n construction, yacc automatically inserts the
appropriate union name, so that no unwanted conversions will take place. In addition,
type-checking commands such as lint(1) will be far more silent.

There are three mechanisms used to provide for this typing. First, there is a way
of defining the union; this must be done by the programmer since other
programs, notably the lexical analyzer, must know about the union member names.
Second, there is a way of associating a union member name with tokens and
nonterminals. Finally, there is a mechanism for describing the type of those few
values where yacc cannot easily determine the type.


To declare the union, the programmer includes in the declaration section: 


    %union {
            body of union ...
    }


This declares the yacc value stack, and the external variables yylval and
yyval, to have type equal to this union. If yacc was invoked with the -d
option, the union declaration is copied onto the y.tab.h file. Alternatively, the
union may be declared in a header file, and a typedef used to define the variable
YYSTYPE to represent this union. Thus, the header file might also have said:

    typedef union {
            body of union ...
    } YYSTYPE;

The header file must be included in the declarations section, by use of %{ and
%}.


Once YYSTYPE is defined, the union member names must be associated with the
various terminal and nonterminal names. The construction

    < name >

is used to indicate a union member name. If this follows one of the keywords
%token, %left, %right, and %nonassoc, the union member name is
associated with the tokens listed. Thus, saying

    %left <optype> '+' '-'

will tag any reference to values returned by these two tokens with the union
member name optype. Another keyword, %type, is used similarly to associate
union member names with nonterminals. Thus, one might say

    %type <nodetype> expr stat


There remain a couple of cases where these mechanisms are insufficient. If there
is an action within a rule, the value returned by this action has no a priori type.
Similarly, reference to left-context values (such as $0; see the previous
subsection) leaves yacc with no easy way of knowing the type. In this case, a type can
be imposed on the reference by inserting a union member name, between < and
>, immediately after the first $. An example of this usage is

    rule    : aaa { $<intval>$ = 3; } bbb
                    { fun( $<intval>2, $<other>0 ); }
            ;






This syntax has little to recommend it, but the situation arises rarely. 
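
As a rough picture of what the union member names buy, the value stack can be
thought of as an array of unions, with each typed reference turned into an
access to one named member. The sketch below mimics that by hand in plain C.
The names VALUE, vstack and the helper functions are illustrative assumptions,
and the code that yacc actually generates differs in detail.

```c
#include <assert.h>

typedef union {
    int    intval;      /* member used by integer-valued symbols */
    double dval;        /* member used by floating point symbols */
} VALUE;

#define VDEPTH 16

static VALUE vstack[VDEPTH];    /* stand-in for the parser's value stack */
static int   vtop = 0;

static void push_int(int i)
{
    assert(vtop < VDEPTH);
    vstack[vtop++].intval = i;
}

static void push_double(double d)
{
    assert(vtop < VDEPTH);
    vstack[vtop++].dval = d;
}

/* Roughly what a reference tagged <intval> amounts to: the member is
   fixed when the specification is compiled, so no conversion occurs. */
static int top_int(void)
{
    assert(vtop > 0);
    return vstack[vtop - 1].intval;
}

static double top_double(void)
{
    assert(vtop > 0);
    return vstack[vtop - 1].dval;
}
```

Reading a slot through the wrong member is exactly the kind of error that the
union member tags allow yacc, and lint, to catch.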

A sample specification is given in Section 11.13. The facilities in this subsection
are not triggered until they are used: in particular, the use of %type will turn on
these mechanisms. When they are used, there is a fairly strict level of checking. For
example, use of $n or $$ to refer to something with no defined type is diagnosed.
If these facilities are not triggered, the yacc value stack is used to hold ints,
as was true historically. This paper is reprinted in this manual.

11.11. A Simple Example

This example gives the complete yacc specification for a small desk calculator;
the desk calculator has 26 registers, labeled ‘a’ through ‘z’, and accepts
arithmetic expressions made up of the operators +, -, *, /, % (mod operator), &
(bitwise and), | (bitwise or), and assignment. If an expression at the top level is an
assignment, the value is not printed; otherwise it is. As in C, an integer that
begins with 0 (zero) is assumed to be octal; otherwise, it is assumed to be
decimal.

As an example of a yacc specification, the desk calculator does a reasonable job
of showing how precedences and ambiguities are used, and demonstrating simple
error recovery. The major oversimplifications are that the lexical analysis phase
is much simpler than for most applications, and the output is produced
immediately, line-by-line. Note the way that decimal and octal integers are read in by
the grammar rules; this job is probably better done by the lexical analyzer.



    %{
    # include <stdio.h>
    # include <ctype.h>

    int regs[26];
    int base;
    %}

    %start list

    %token DIGIT LETTER

    %left '|'
    %left '&'
    %left '+' '-'
    %left '*' '/' '%'
    %left UMINUS    /* supplies precedence for unary minus */

    %%      /* beginning of rules section */

    list    : /* empty */
            | list stat '\n'
            | list error '\n'
                    { yyerrok; }
            ;

    stat    : expr
                    { printf( "%d\n", $1 ); }
            | LETTER '=' expr
                    { regs[$1] = $3; }
            ;

    expr    : '(' expr ')'
                    { $$ = $2; }
            | expr '+' expr
                    { $$ = $1 + $3; }
            | expr '-' expr
                    { $$ = $1 - $3; }
            | expr '*' expr
                    { $$ = $1 * $3; }
            | expr '/' expr
                    { $$ = $1 / $3; }
            | expr '%' expr
                    { $$ = $1 % $3; }
            | expr '&' expr
                    { $$ = $1 & $3; }
            | expr '|' expr
                    { $$ = $1 | $3; }
            | '-' expr %prec UMINUS
                    { $$ = - $2; }
            | LETTER
                    { $$ = regs[$1]; }
            | number
            ;

    number  : DIGIT
                    { $$ = $1; base = ($1==0) ? 8 : 10; }
            | number DIGIT
                    { $$ = base * $1 + $2; }
            ;

    %%      /* start of programs */

    yylex() {       /* lexical analysis routine */
            /* returns LETTER for lower case letter, yylval = 0 thru 25 */
            /* return DIGIT for digit, yylval = 0 thru 9 */
            /* all other characters are returned immediately */

            int c;

            while( (c=getchar()) == ' ' ) { /* skip blanks */ }

            /* c is now nonblank */

            if( islower( c ) ) {
                    yylval = c - 'a';
                    return( LETTER );
            }
            if( isdigit( c ) ) {
                    yylval = c - '0';
                    return( DIGIT );
            }
            return( c );
    }
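
The number rules in this specification build up an integer one digit at a
time, choosing base 8 when the first digit is 0. If that job were moved into
the lexical analyzer instead, the conversion might look like the following
sketch. The function scan_number() is a hypothetical helper, not part of the
calculator; like the grammar rules it does not reject the digits 8 and 9 in
an octal number.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Convert the run of digits at the start of s to an integer, using
 * the desk calculator's convention: a leading 0 selects octal,
 * anything else selects decimal.  Conversion stops at the first
 * character that is not a digit.
 */
int scan_number(const char *s)
{
    int base;
    int value = 0;

    assert(s != NULL && isdigit((unsigned char)*s));
    base = (*s == '0') ? 8 : 10;
    for (; isdigit((unsigned char)*s); s++)
        value = base * value + (*s - '0');   /* same step as the number rule */
    return value;
}
```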
    def     : START IDENTIFIER
            | UNION { Copy union definition to output }
            | LCURL { Copy C code to output file } RCURL
            | ndefs rword tag nlist
            ;

    rword   : TOKEN
            | LEFT
            | RIGHT
            | NONASSOC
            | TYPE
            ;

    tag     : /* empty: union tag is optional */
            | '<' IDENTIFIER '>'
            ;

    nlist   : nmno
            | nlist nmno
            | nlist ',' nmno
            ;

    nmno    : IDENTIFIER            /* NOTE: literal illegal with %type */
            | IDENTIFIER NUMBER     /* NOTE: illegal with %type */
            ;

            /* rules section */

    rules   : C_IDENTIFIER rbody prec
            | rules rule
            ;

    rule    : C_IDENTIFIER rbody prec
            | '|' rbody prec
            ;

    rbody   : /* empty */
            | rbody IDENTIFIER
            | rbody act
            ;

    act     : '{' { Copy action, translate $$, etc. } '}'
            ;

    prec    : /* empty */
            | PREC IDENTIFIER
            | PREC IDENTIFIER act
            | prec ';'
            ;


11.13. An Advanced Example

This section gives an example of a grammar using some of the advanced features
discussed in Section 11.10. The desk calculator example in Section 11.11 is
modified to provide a desk calculator that does floating point interval arithmetic.
The calculator understands floating point constants, the arithmetic operations +,
-, *, /, unary -, and = (assignment), and has 26 floating point variables, ‘a’
through ‘z’. Moreover, it also understands intervals, written

    ( x , y )

where x is less than or equal to y. There are 26 interval-valued variables ‘A’
through ‘Z’ that may also be used. The usage is similar to that in Section 11.11:
assignments return no value, and print nothing, while expressions print the
(floating or interval) value.

This example explores a number of interesting features of yacc and C. Intervals
are represented by a structure, consisting of the left and right endpoint values,
stored as doubles. This structure is given a type name, INTERVAL, by using
typedef.

The yacc value stack can also contain floating point scalars, and integers (used
to index into the arrays holding the variable values). Notice that this entire
strategy depends strongly on being able to assign structures and unions in C. In fact,
many of the actions call functions that return structures as well.

It is also worth noting the use of YYERROR to handle error conditions: division 
by an interval containing 0, and an interval presented in the wrong order. In 
effect, the error recovery mechanism of yacc is used to throw away the rest of 
the offending line. 

In addition to the mixing of types on the value stack, this grammar also
demonstrates an interesting use of syntax to keep track of the type (for example, scalar
or interval) of intermediate expressions. Note that a scalar can be automatically
promoted to an interval if the context demands an interval-value. This causes a
large number of conflicts when the grammar is run through yacc: 18
Shift/Reduce and 26 Reduce/Reduce. The problem can be seen by looking at the
two input lines:

2.5 + ( 3.5 - 4. ) 

and 

2.5 + ( 3.5 , 4. ) 

Notice that the 2.5 is to be used in an interval-valued expression in the second
example, but this fact is not known until the ',' is read; by this time, 2.5 is
finished, and the parser cannot go back and change its mind. More generally, it
might be necessary to look ahead an arbitrary number of tokens to decide 
whether to convert a scalar to an interval. This problem is evaded by having two 
rules for each binary interval-valued operator: one when the left operand is a 
scalar, and one when the left operand is an interval. In the second case, the right 
operand must be an interval, so the conversion will be applied automatically. 
Despite this evasion, there are still many cases where the conversion may be 
applied or not, leading to the above conflicts. They are resolved by listing the 
rules that yield scalars first in the specification file; in this way, the conflicts will 
be resolved in the direction of keeping scalar-valued expressions scalar-valued 
until they are forced to become intervals. 

This way of handling multiple types is very instructive, but not very general. If
there were many kinds of expression types, instead of just two, the number of
rules needed would increase dramatically, and the conflicts even more
dramatically. Thus, while this example is instructive, it is better practice in a more
normal programming language environment to keep the type information as part
of the value, and not as part of the grammar.

Revision A of 9 May 1988

Finally, a word about the lexical analysis. The only unusual feature is the
treatment of floating point constants. The C library routine atof() is used to do the
actual conversion from a character string to a double-precision value. If the
lexical analyzer detects an error, it responds by returning a token that is illegal in the
grammar, provoking a syntax error in the parser, and thence error recovery.




    %{
    # include <stdio.h>
    # include <ctype.h>

    typedef struct interval {
            double lo, hi;
    } INTERVAL;

    INTERVAL vmul(), vdiv();

    double atof();

    double dreg[ 26 ];
    INTERVAL vreg[ 26 ];
    %}

    %start lines

    %union {
            int ival;
            double dval;
            INTERVAL vval;
    }

    %token <ival> DREG VREG     /* indices into dreg, vreg arrays */
    %token <dval> CONST         /* floating point constant */
    %type <dval> dexp           /* expression */
    %type <vval> vexp           /* interval expression */

            /* precedence information about the operators */

    %left '+' '-'
    %left '*' '/'
    %left UMINUS    /* precedence for unary minus */

    %%

    lines   : /* empty */
            | lines line
            ;

    line    : dexp '\n'
                    { printf( "%15.8f\n", $1 ); }
            | vexp '\n'
                    { printf( "(%15.8f , %15.8f )\n", $1.lo, $1.hi ); }
            | DREG '=' dexp '\n'
                    { dreg[$1] = $3; }
            | VREG '=' vexp '\n'
                    { vreg[$1] = $3; }
            | error '\n'
                    { yyerrok; }
            ;

    dexp    : CONST
            | DREG
                    { $$ = dreg[$1]; }
            | dexp '+' dexp
                    { $$ = $1 + $3; }
            | dexp '-' dexp
                    { $$ = $1 - $3; }
            | dexp '*' dexp
                    { $$ = $1 * $3; }
            | dexp '/' dexp
                    { $$ = $1 / $3; }
            | '-' dexp %prec UMINUS
                    { $$ = - $2; }
            | '(' dexp ')'
                    { $$ = $2; }
            ;

    vexp    : dexp
                    { $$.hi = $$.lo = $1; }
            | '(' dexp ',' dexp ')'
                    {
                      $$.lo = $2;
                      $$.hi = $4;
                      if( $$.lo > $$.hi ){
                              printf( "interval out of order\n" );
                              YYERROR;
                      }
                    }
            | VREG
                    { $$ = vreg[$1]; }
            | vexp '+' vexp
                    { $$.hi = $1.hi + $3.hi;
                      $$.lo = $1.lo + $3.lo; }
            | dexp '+' vexp
                    { $$.hi = $1 + $3.hi;
                      $$.lo = $1 + $3.lo; }
            | vexp '-' vexp
                    { $$.hi = $1.hi - $3.lo;
                      $$.lo = $1.lo - $3.hi; }
            | dexp '-' vexp
                    { $$.hi = $1 - $3.lo;
                      $$.lo = $1 - $3.hi; }
            | vexp '*' vexp
                    { $$ = vmul( $1.lo, $1.hi, $3 ); }
            | dexp '*' vexp
                    { $$ = vmul( $1, $1, $3 ); }
            | vexp '/' vexp
                    { if( dcheck( $3 ) ) YYERROR;
                      $$ = vdiv( $1.lo, $1.hi, $3 ); }
            | dexp '/' vexp
                    { if( dcheck( $3 ) ) YYERROR;
                      $$ = vdiv( $1, $1, $3 ); }
            | '-' vexp %prec UMINUS
                    { $$.hi = -$2.lo; $$.lo = -$2.hi; }
            | '(' vexp ')'
                    { $$ = $2; }
            ;
    %%

    # define BSZ 50 /* buffer size for floating point numbers */

            /* lexical analysis */

    yylex() {
            register c;

            while( (c=getchar()) == ' ' ){ /* skip over blanks */ }

            if( isupper( c ) ){
                    yylval.ival = c - 'A';
                    return( VREG );
            }
            if( islower( c ) ){
                    yylval.ival = c - 'a';
                    return( DREG );
            }

            if( isdigit( c ) || c=='.' ){
                    /* gobble up digits, points, exponents */

                    char buf[BSZ+1], *cp = buf;
                    int dot = 0, exp = 0;

                    for( ; (cp-buf)<BSZ ; ++cp, c=getchar() ){

                            *cp = c;
                            if( isdigit( c ) ) continue;
                            if( c == '.' ){
                                    if( dot++ || exp ) return( '.' );
                                            /* will cause syntax error */
                                    continue;
                            }
                            if( c == 'e' ){
                                    if( exp++ ) return( 'e' );
                                            /* will cause syntax error */
                                    continue;
                            }

                            /* end of number */
                            break;
                    }
                    *cp = '\0';
                    if( (cp-buf) >= BSZ )
                            printf( "constant too long: truncated\n" );
                    else ungetc( c, stdin ); /* push back last char read */
                    yylval.dval = atof( buf );
                    return( CONST );
            }
            return( c );
    }

    INTERVAL hilo( a, b, c, d ) double a, b, c, d; {
            /* returns the smallest interval containing a, b, c, and d */
            /* used by *, / routines */
            INTERVAL v;

            if( a>b ) { v.hi = a; v.lo = b; }
            else { v.hi = b; v.lo = a; }

            if( c>d ) {
                    if( c>v.hi ) v.hi = c;
                    if( d<v.lo ) v.lo = d;
            }
            else {
                    if( d>v.hi ) v.hi = d;
                    if( c<v.lo ) v.lo = c;
            }
            return( v );
    }

    INTERVAL vmul( a, b, v ) double a, b; INTERVAL v; {
            return( hilo( a*v.hi, a*v.lo, b*v.hi, b*v.lo ) );
    }

    dcheck( v ) INTERVAL v; {
            if( v.hi >= 0. && v.lo <= 0. ){
                    printf( "divisor interval contains 0.\n" );
                    return( 1 );
            }
            return( 0 );
    }

    INTERVAL vdiv( a, b, v ) double a, b; INTERVAL v; {
            return( hilo( a/v.hi, a/v.lo, b/v.hi, b/v.lo ) );
    }
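
The interval routines can be tried out apart from the parser. The following
is a sketch of the same arithmetic written with function prototypes; the type
Interval and the names hull4(), interval_mul(), contains_zero() and
interval_div() are illustrative stand-ins for the listing's INTERVAL, hilo(),
vmul(), dcheck() and vdiv(), and take both operands as intervals.

```c
#include <assert.h>

typedef struct {
    double lo, hi;
} Interval;

/* Smallest interval containing the four values a, b, c and d. */
static Interval hull4(double a, double b, double c, double d)
{
    Interval v;

    v.lo = v.hi = a;
    if (b < v.lo) v.lo = b;
    if (b > v.hi) v.hi = b;
    if (c < v.lo) v.lo = c;
    if (c > v.hi) v.hi = c;
    if (d < v.lo) v.lo = d;
    if (d > v.hi) v.hi = d;
    return v;
}

/* Product: the extremes always occur among the endpoint products. */
Interval interval_mul(Interval x, Interval y)
{
    return hull4(x.lo * y.lo, x.lo * y.hi, x.hi * y.lo, x.hi * y.hi);
}

/* Nonzero if 0 lies within v, in which case v cannot be a divisor. */
int contains_zero(Interval v)
{
    return v.lo <= 0.0 && v.hi >= 0.0;
}

/* Quotient; the caller must first rule out a divisor containing 0. */
Interval interval_div(Interval x, Interval y)
{
    assert(!contains_zero(y));
    return hull4(x.lo / y.lo, x.lo / y.hi, x.hi / y.lo, x.hi / y.hi);
}
```

For example, multiplying ( 2 , 3 ) by ( -1 , 4 ) gives the endpoint products
-2, 8, -3 and 12, so the result is ( -3 , 12 ).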


11.14. Old Features Supported but not Encouraged

This section mentions synonyms and features which are supported for historical
continuity, but, for various reasons, are not encouraged.

1. Literals may also be delimited by double quotes ‘"’.

2. Literals may be more than one character long. If all the characters are
alphabetic, numeric, or _, the type number of the literal is defined, just as if the
literal did not have the quotes around it. Otherwise, it is difficult to find the
value for such literals.

The use of multi-character literals is likely to mislead those unfamiliar with
yacc, since it suggests that yacc is doing a job which must be actually
done by the lexical analyzer.

3. Most places where % is legal, backslash ‘\’ may be used. In particular, \\ is
the same as %%, \left the same as %left, etc.

4. There are a number of other synonyms:

    %< is the same as %left
    %> is the same as %right
    %binary and %2 are the same as %nonassoc
    %0 and %term are the same as %token
    %= is the same as %prec

5. Actions may also have the form

    ={...}

and the curly braces can be dropped if the action is a single C statement.

6. C code between %{ and %} used to be permitted at the head of the rules
section, as well as in the declaration section.
The curses Library: Screen-Oriented Cursor Motions

Overview

curses is a library package for:

□ Updating a screen with reasonable optimization,

□ Getting input from the terminal in a screen-oriented fashion, and

□ Moving the cursor from one point to another, independent of the two
previous functions.

These routines all use the termcap database to describe the capabilities of the
terminal.

In making available the generalized terminal descriptions in termcap, much
information was made available to the programmer, but little work was taken out
of one’s hands. curses helps the programmer perform the required functions,
those of movement optimization and optimal screen updating, without doing any
of the dirty work, and (hopefully) with nearly as much ease as is necessary to
simply print or read things.

The curses package is split into three parts:

1. Screen updating without user input;

2. Screen updating with user input; and

3. Cursor motion optimization.

It is possible to use the motion optimization without using either of the other
two, and screen updating and input can be done without any programmer
knowledge of the motion optimization, or indeed the termcap database itself.

Terminology

In this chapter, the terminology illustrated in the table below is used with
reasonable consistency.

Table 12-1  Description of Terms

Term        Description

window      An internal representation containing an image of what a section
            of the terminal screen may look like at some point in time. This
            subsection can either encompass the entire terminal screen, or
            any smaller portion down to a single character within that screen.
            Note that the term window is used elsewhere in the Sun system
            manuals when describing the window management packages for
            driving the bitmapped screens. curses windows bear little, if
            any, resemblance to the window system concepts.

terminal    Sometimes called screen. The package’s idea of what
            the terminal’s screen currently looks like, that is, what the user
            sees now. This is a special screen.

screen      This is a subset of windows which are as large as the terminal
            screen, that is, they start at the upper left hand corner and
            encompass the lower right hand corner. One of these, stdscr, is
            automatically provided for the programmer.


Cursor Addressing Conventions

The curses library routines address positions on a screen with the y coordinate
first and the x coordinate second. This follows the convention of most terminals
that address the screen in row, column order. The reader should note this
convention.

Compiling Things

To use the curses library, it is necessary to have certain types and variables
defined. Therefore, the programmer must have a line:

    #include <curses.h>

at the top of the program source.*

Also, compilations should have the following form:

    tutorial% cc [ C-compiler options ] filename ... -lcurses -ltermcap

* The header file <curses.h> needs to include <sgtty.h>, so one should not do so oneself. The
screen package also uses the Standard I/O library, so <curses.h> includes <stdio.h>. It is redundant (but
harmless) to include it again.

Screen Updating

To update the screen optimally, it is necessary for the routines to know what the
screen currently looks like and what the programmer wants it to look like next.
For this purpose, a data type (structure) named WINDOW is defined which
describes a window image to the routines, including its starting position on the
screen (the (y, x) coordinates of the upper left hand corner) and its size. One of
these (called curscr, for current screen) is a screen image of what the terminal
currently looks like. Another screen (called stdscr, for standard screen) is
provided by default to make changes on.

A window is a purely internal representation. It is used to build and store a
potential image of a portion of the terminal. It doesn’t bear any necessary
relation to what is really on the terminal screen. It is more like an array of characters
on which to make changes.

When one has a window which describes what some part of the terminal should
look like, the routine refresh() (or wrefresh() if the window is not
stdscr) is called. refresh() makes the terminal, in the area covered by the
window, look like that window. Note, therefore, that changing something on a
window does not change the terminal. Actual updates to the terminal screen are
made only by calling refresh() or wrefresh(). This allows the
programmer to maintain several different ideas of what a portion of the terminal screen
should look like. Also, changes can be made to windows in any order, without
regard to motion efficiency. Then, at will, the programmer can effectively say
‘make it look like this,’ and let the package worry about the best way to do this.

Naming Conventions

As hinted above, the routines can use several windows, but two are automatically
given: curscr, which knows what the terminal looks like, and stdscr, which
is what the programmer wants the terminal to look like next. The user should
never really access curscr directly. Changes should be made to the
appropriate screen, and then the routine refresh() (or wrefresh()) should be
called.

Many functions are set up to deal with stdscr as a default screen. For
example, to add a character to stdscr, one calls addch() with the desired
character. If a different window is to be used, the routine waddch() (for
“window-specific” addch()) is provided.* This convention of prepending function
names with a w when they are to be applied to specific windows is consistent.
The only routines which do not do this are those to which a window must always
be specified.

* Actually, addch() is really a macro with arguments, as are most of the "functions" which deal with
stdscr as a default.
To move the current (y, x) coordinates from one point to another, the routines
move() and wmove() are provided. However, it is often desirable to first move
and then perform some I/O operation. To avoid clumsiness, most I/O routines
can be preceded by the prefix mv and the desired (y, x) coordinates then can
be added to the arguments to the function. For example, the calls:

move(y, x);
addch(ch);

can be replaced by

mvaddch(y, x, ch);

and

wmove(win, y, x);
waddch(win, ch);

can be replaced by

mvwaddch(win, y, x, ch);

Note that the window description pointer (win) comes before the added (y, x)
coordinates. If such pointers are needed, they are always the first parameters
passed.
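
The separation between a window and the terminal described under Screen
Updating can be modeled in a few lines of ordinary C. The sketch below is not
the library's code: model_window, model_terminal, model_addch() and
model_refresh() are hypothetical names invented for illustration, and a single
row of MODEL_COLS cells stands in for a whole screen. It only shows that
changes accumulate in the window and reach the "terminal" when the refresh
step runs.

```c
#include <assert.h>

#define MODEL_COLS 8   /* hypothetical width of the model window */

/* A toy stand-in for a WINDOW: the image the programmer is building. */
struct model_window {
    char cells[MODEL_COLS];
};

/* A toy stand-in for the terminal (the role curscr plays in the text). */
struct model_terminal {
    char cells[MODEL_COLS];
};

/* Change one cell of the window only; the terminal is not touched. */
void model_addch(struct model_window *win, int x, char ch)
{
    assert(x >= 0 && x < MODEL_COLS);
    win->cells[x] = ch;
}

/*
 * Make the terminal look like the window, rewriting only the cells
 * that differ.  Returns the number of cells that had to be rewritten.
 */
int model_refresh(const struct model_window *win,
                  struct model_terminal *term)
{
    int x, changed = 0;

    for (x = 0; x < MODEL_COLS; x++) {
        if (term->cells[x] != win->cells[x]) {
            term->cells[x] = win->cells[x];
            changed++;
        }
    }
    return changed;
}
```

Calling model_addch() several times and then model_refresh() once mirrors the
‘make it look like this’ style of use; a second model_refresh() with no
intervening changes rewrites nothing.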

12.1. Variables

Many variables that describe the terminal environment are available to the
programmer. They are:

Table 12-2 Variables to Describe the Terminal Environment

Type        Name       Description
WINDOW *    curscr     current version of the screen (terminal screen).
WINDOW *    stdscr     standard screen. Most updates are done here.
char *      Def_term   default terminal type if type cannot be determined
bool        My_term    use the terminal specification in Def_term as
                       terminal, regardless of real terminal type
char *      ttytype    full name of the current terminal.
int         LINES      number of lines on the terminal
int         COLS       number of columns on the terminal
int         ERR        error flag returned by routines on a fail.
int         OK         error flag returned by routines when things go right.

There are also several #define constants and types which are of general
usefulness:

reg      storage class register (for example, reg int i;)
bool     boolean type, actually a char (for example, bool doneit;)
TRUE     boolean ‘true’ flag (1).
FALSE    boolean ‘false’ flag (0).

12.2. Programming Curses

This is a description of how to actually use the screen package. In it, we
assume all updating, reading, and so on, is applied to stdscr. All
instructions will work on any window, by changing the function name and
parameters as mentioned above.

Starting Up

To use the screen package, the routines must know about terminal
characteristics, and the space for curscr and stdscr must be allocated. These
functions are performed by initscr(). Since it must allocate space for the
windows, it can overflow core when attempting to do so. On this rather rare
occasion, initscr() returns ERR. initscr() must be called before any of the
routines which affect windows are used. If it is not, the program will core
dump as soon as either curscr or stdscr is referenced. However, it is usually
best to wait to call it until after you are sure you will need it, like after
checking for startup errors. Terminal status changing routines like nl() and
cbreak() should be called after initscr().

Now that the screen windows have been allocated, you can set them up for the
run. If you want to, say, allow the window to scroll, use scrollok(). If you
want the cursor to be left after the last change, use leaveok(). If this isn’t
done, refresh() moves the cursor to the window’s current (y, x) coordinates
after updating it. New windows of your own can be created, too, by using the
functions newwin() and subwin(). delwin() gets rid of old windows. If you wish
to change the official size of the terminal by hand, just set the variables
LINES and COLS to be what you want, and then call initscr(). This is best done
before, but can be done either before or after, the first call to initscr(),
as it always deletes any existing stdscr and/or curscr before creating new
ones.


The Nitty-Gritty

Output

Now that we have set things up, we will want to actually update the terminal.
The basic functions used to change what appears on a window are addch() and
move(). addch() adds a character at the current (y, x) coordinates, returning
ERR if it would cause the window to illegally scroll, that is, printing a
character in the lower right-hand corner of a terminal which automatically
scrolls if scrolling is not allowed. move() changes the current (y, x)
coordinates to whatever you want them to be. It returns ERR if you try to move
off the window when scrolling is not allowed. As mentioned above, you can
combine the two into mvaddch() to do both things in one fell swoop.

The other output functions, such as addstr() and printw(), all call addch() to
add characters to the window.

After you have put on the window what you want there, when you want the
portion of the terminal covered by the window to be made to look like it, you
must call refresh(). To optimize finding changes, refresh() assumes that any
part of the window not changed since the last refresh() of that window has not
been changed on the terminal, that is, that you have not refreshed a portion
of the terminal with an overlapping window. If this is not the case, the
routines touchwin(), touchline(), and touchoverlap() are provided to make it
look like the entire window has been changed, thus forcing refresh() to check
the whole subsection of the terminal for changes.

If you call wrefresh() with curscr, it will make the screen look like curscr
thinks it looks like. This is useful for implementing a command to redraw the
screen in case it gets messed up.

Input

Input is essentially a mirror image of output. The complementary function to
addch() is getch() which, if echo is set, calls addch() to echo the character.
Since the screen package needs to know what is on the terminal at all times,
if characters are to be echoed, the tty must be in raw or cbreak mode. If it
is not, getch() sets it to be cbreak, reads in the character, and then resets
the mode of the terminal to what it was before the call.

Miscellaneous

All sorts of functions exist for maintaining and changing information about
the windows. For the most part, the descriptions in Section 12.4 should
suffice.

Finishing Up

To do certain optimizations, and, on some terminals, to work at all, some
things must be done before the screen routines start up. These functions are
performed in gettmode() and setterm(), which are called by initscr(). To clean
up after the routines, the routine endwin() is provided. It restores tty modes
to what they were when initscr() was first called. Thus, anytime after the
call to initscr(), endwin() should be called before exiting.

12.3. Cursor Motion Optimization: Standing Alone

It is possible to use the cursor optimization functions of this screen package
without the overhead and additional size of the screen updating functions. The
screen updating functions are designed for uses where parts of the screen are
changed, but the overall image remains the same. Certain other programs will
find it difficult to use these functions in this manner without considerable
unnecessary program overhead. For such applications, such as some “crt hacks”*
and optimizing cat(1)-type programs, all that is needed is the motion
optimizations. This, therefore, is a description of what goes on at the lower
levels of this screen package. The descriptions assume a certain amount of
familiarity with programming problems and some finer points of C. None of it
is terribly difficult, but you should be forewarned.

* Graphics programs designed to run on character-oriented terminals.
Terminal Information

To use a terminal’s features to the best of a program’s abilities, you must
first know what they are. The termcap database describes these, but a certain
amount of decoding is necessary, and there are, of course, both efficient and
inefficient ways of reading them in. The algorithm that curses uses is taken
from vi(1) and is efficient. It reads them into a set of variables whose names
are two uppercase letters with some mnemonic value. For example, HO is a
string which moves the cursor to the "home" position.* As there are two types
of variables involving ttys, there are two routines. The first, gettmode(),
sets some variables based upon the tty modes accessed by gtty(2) and stty(2).
The second, setterm(), does a larger task by reading in the descriptions from
the termcap database. This is the way these routines are used by initscr():

if (isatty(0)) {
        gettmode();
        if (sp = getenv("TERM"))
                setterm(sp);
}
else
        setterm(Def_term);
_puts(TI);
_puts(VS);

isatty() checks to see if file descriptor 0 is a terminal.† If it is,
gettmode() sets the terminal description modes from a gtty(2). getenv() is
then called to get the name of the terminal, and that value (if there is one)
is passed to setterm(), which reads in the variables from termcap associated
with that terminal. getenv() returns a pointer to a string containing the name
of the terminal, which we save in the character pointer sp. If isatty()
returns false, the default terminal Def_term is used. The TI and VS sequences
initialize the terminal. _puts() is a macro which uses tputs() (see
termcap(3X)) to put out a string. It is these things which endwin() undoes.
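
The choice of terminal name made during start-up can be written as a small
pure function. This is a sketch under stated assumptions, not the library's
code: model_choose_term() is a hypothetical name, is_tty stands in for the
result of isatty(0), env_term for the value of getenv("TERM"), and my_term and
def_term for My_term and Def_term. Falling back to def_term when TERM is unset
or empty is an assumption drawn from the description of Def_term as the
default used when the type cannot be determined.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the terminal name that would be handed to setterm().
 * env_term may be NULL when TERM is not set in the environment.
 */
const char *model_choose_term(int is_tty, const char *env_term,
                              int my_term, const char *def_term)
{
    assert(def_term != NULL);

    if (my_term)                 /* My_term set: Def_term is always used */
        return def_term;
    if (is_tty && env_term != NULL && env_term[0] != '\0')
        return env_term;         /* on a terminal and TERM names it */
    return def_term;             /* not a tty, or type undetermined */
}
```

Keeping the decision separate from the calls to gettmode() and setterm() makes
it possible to check each case without a terminal attached.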


Movement Optimizations, or, Getting Over Yonder

Now that we have all this useful information, it would be nice to do something
with it. The most difficult thing to do properly is motion optimization. When
you consider how many different features various terminals have (tabs,
backtabs, non-destructive space, home sequences, absolute tabs, ...) you can
see that deciding how to get from here to there can be a decidedly non-trivial
task.

After using gettmode() and setterm() to get the terminal descriptions, the
function mvcur() deals with this task. Its usage is simple: you simply tell it
where you are now and where you want to go. For example,

mvcur(0, 0, LINES/2, COLS/2)

would move the cursor from the home position (0, 0) to the middle of the
screen. If you wish to force absolute addressing, you can use the function
tgoto() from the termcap(3X) routines, or you can tell mvcur() that you are
impossibly far away. For example, to absolutely address the lower left hand
corner of the screen from anywhere just claim that you are in the upper right
hand corner:

mvcur(0, COLS-1, LINES-1, 0)

* These names are identical to those variables used in the /etc/termcap
database to describe each capability. See Section 12.5 for a list of those
read, and termcap(5) for a full description.

† isatty() is defined in the default C library function routines. It does a
gtty(2) on the file descriptor and checks the return value.

12.4. Curses Functions

In the following definitions, ‘*’ means that the ‘function’ is really a
#define macro with arguments. This means that it will not show up in stack
traces in the debugger, or, in the case of such functions as addch(), it will
show up as its ‘w’ counterpart. The arguments are given to show the order and
type of each. Their names are not mandatory, just suggestive.

Output Functions

addch() and waddch() - Add Character to Window

addch(ch)
char ch;

waddch(win, ch)
WINDOW *win;
char ch;

Add the character ch on the window at the current (y, x) co-ordinates. If the
character is a NEWLINE ('\n') the line is cleared to the end, and the current
(y, x) co-ordinates are changed to the beginning of the next line if newline
mapping is on, or to the next line at the same x co-ordinate if it is off. A
return ('\r') moves to the beginning of the line on the window. Tabs ('\t')
are expanded into spaces in the normal tabstop positions of every eight
characters. This returns ERR if it would cause the screen to scroll illegally.

addstr() and waddstr() - Add String to Window

addstr(str)
char *str;

waddstr(win, str)
WINDOW *win;
char *str;

Add the string pointed to by str on the window at the current (y, x)
co-ordinates. This returns ERR if it would cause the screen to scroll
illegally. In this case, it puts on as much as it can.
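
The rules given for addch() on how NEWLINE, return and tab change the current
(y, x) co-ordinates can be sketched as a function from one position to the
next. The names model_pos, model_advance() and MODEL_TABSTOP are hypothetical,
and the sketch deliberately ignores line wrapping, scrolling and the clearing
of the rest of the line; it models only the movement of the co-ordinates, and
assumes an ordinary character advances one column.

```c
#include <assert.h>

#define MODEL_TABSTOP 8   /* tab stops every eight columns, per the text */

struct model_pos {
    int y, x;
};

/*
 * Return the position that follows adding ch at pos.  nl_mapping is
 * nonzero when newline mapping is on.
 */
struct model_pos model_advance(struct model_pos pos, char ch, int nl_mapping)
{
    assert(pos.y >= 0 && pos.x >= 0);

    switch (ch) {
    case '\n':
        pos.y++;                 /* next line ...                     */
        if (nl_mapping)
            pos.x = 0;           /* ... at its start if mapping is on */
        break;
    case '\r':
        pos.x = 0;               /* beginning of the current line     */
        break;
    case '\t':
        /* advance to the next multiple of MODEL_TABSTOP */
        pos.x += MODEL_TABSTOP - (pos.x % MODEL_TABSTOP);
        break;
    default:
        pos.x++;                 /* ordinary character: one column    */
        break;
    }
    return pos;
}
```

A string routine in the style of addstr() would apply this step once per
character, which is consistent with the statement that the other output
functions all call addch().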
box() - Draw Box Around Window

box(win, vert, hor)
WINDOW *win;
char vert, hor;

Draws a box around the window using vert as the character for drawing the
vertical sides, and hor for drawing the horizontal lines. If scrolling is not
allowed, and the window encompasses the lower right-hand corner of the
terminal, the corners are left blank to avoid a scroll.


clear() and wclear() - Reset Window

clear()

wclear(win)
WINDOW *win;

Resets the entire window to blanks. If win is a screen, this sets the clear
flag, which sends a clear-screen sequence on the next refresh() call. This
also moves the current (y, x) co-ordinates to (0, 0).


clearok() - Set Clear Flag

clearok(scr, boolf)
WINDOW *scr;
bool boolf;

Sets the clear flag for the screen scr. If boolf is TRUE, this forces a
clear-screen to be printed on the next refresh(); if boolf is FALSE, it stops
refresh() from doing so. This only works on screens, and, unlike clear(), does
not alter the contents of the screen. If scr is curscr, the next refresh()
call causes a clear-screen, even if the window passed to refresh() is not a
screen.

clrtobot() and wclrtobot() - Clear to Bottom

clrtobot()

wclrtobot(win)
WINDOW *win;

Wipes the window clear from the current (y, x) co-ordinates to the bottom.
This does not force a clear-screen sequence on the next refresh under any
circumstances. This has no associated mv function.


clrtoeol() and wclrtoeol() - Clear to End of Line

clrtoeol()

wclrtoeol(win)
WINDOW *win;

Wipes the window clear from the current (y, x) co-ordinates to the end of the
line. This has no associated mv function.


delch() and wdelch() - Delete Character

delch()

wdelch(win)
WINDOW *win;

Delete the character at the current (y, x) co-ordinates. Each character after
it on the line shifts to the left, and the last character becomes blank.
deleteln() and wdeleteln() - Delete Current Line

deleteln()

wdeleteln(win)
WINDOW *win;

Delete the current line. Every line below the current one moves up, and the
bottom line becomes blank. The current (y, x) co-ordinates remain unchanged.


erase() and werase() - Erase Window

erase()

werase(win)
WINDOW *win;

Erases the window to blanks without setting the clear flag. This is analogous
to clear(), except that it never causes a clear-screen sequence to be
generated on a refresh(). This has no associated mv function.


flushok() - Control Flushing of stdout

flushok(win, boolf)
WINDOW *win;
bool boolf;

Normally, refresh() performs an fflush() on stdout when it is finished.
flushok() allows you to control this. If boolf is TRUE (non-zero), refresh()
performs the fflush(); if FALSE, refresh() does not.


idlok() - Control Use of Insert/Delete Line

idlok(win, boolf)
WINDOW *win;
bool boolf;

Reserved for future use. When implemented, this will signal refresh() as to
whether it is safe to use “insert line” and “delete line” sequences to update
a window.


insch() and winsch() - Insert Character

insch(c)
char c;

winsch(win, c)
WINDOW *win;
char c;

Insert c at the current (y, x) co-ordinates. Each character after it shifts to
the right, and the last character disappears. This returns ERR if it would
cause the screen to scroll illegally.
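
The shifting performed by insch(), and by its counterpart delch(), can be
pictured on a plain character array that stands for one line of a window. The
functions model_delch() and model_insch() below are hypothetical illustrations
rather than the library routines; width is the number of columns in the line
and x the current column.

```c
#include <assert.h>
#include <string.h>

/*
 * Delete the character at column x: later characters shift left and
 * the last column becomes blank.
 */
void model_delch(char *line, int width, int x)
{
    assert(line != NULL && width > 0 && x >= 0 && x < width);
    memmove(line + x, line + x + 1, (size_t)(width - x - 1));
    line[width - 1] = ' ';
}

/*
 * Insert c at column x: later characters shift right and the character
 * that was in the last column is lost.
 */
void model_insch(char *line, int width, int x, char c)
{
    assert(line != NULL && width > 0 && x >= 0 && x < width);
    memmove(line + x + 1, line + x, (size_t)(width - x - 1));
    line[x] = c;
}
```

memmove() is used because the source and destination regions overlap. Neither
function changes the length of the line, which matches the fixed width of a
window.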
insertln() and winsertln() - Insert Line

insertln()

winsertln(win)
WINDOW *win;

Insert a line above the current one. Every line below the current line is
shifted down, and the bottom line disappears. The current line becomes blank,
and the current (y, x) co-ordinates remain unchanged. This returns ERR if it
would cause the screen to scroll illegally.

move() and wmove() - Move

move(y, x)
int y, x;

wmove(win, y, x)
WINDOW *win;
int y, x;

Change the current (y, x) co-ordinates of the window to y, x. This returns ERR
if it would cause the screen to scroll illegally.


overlay() - Overlay Windows

overlay(win1, win2)
WINDOW *win1, *win2;

Overlay win1 on win2. The contents of win1, insofar as they fit, are placed on
win2 at their starting (y, x) co-ordinates. This is done non-destructively,
that is, blanks on win1 leave the contents of the space on win2 untouched.


overwrite() - Overwrite Windows

overwrite(win1, win2)
WINDOW *win1, *win2;

Overwrite win1 on win2. The contents of win1, insofar as they fit, are placed
on win2 at their starting (y, x) co-ordinates. This is done destructively,
that is, blanks on win1 become blank on win2.
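
The difference between the non-destructive and destructive copies comes down
to how a blank in the source is treated. The sketch below shows this on two
character arrays of equal length; model_overlay() and model_overwrite() are
hypothetical names, and the placement of one window relative to the other is
left out.

```c
#include <assert.h>
#include <stddef.h>

/* Non-destructive copy: blanks in src leave dst untouched. */
void model_overlay(const char *src, char *dst, size_t n)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++)
        if (src[i] != ' ')
            dst[i] = src[i];
}

/* Destructive copy: every cell of src, blanks included, replaces dst. */
void model_overwrite(const char *src, char *dst, size_t n)
{
    size_t i;

    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++)
        dst[i] = src[i];
}
```

With a source of "a b c" and a destination full of 'x', the overlay leaves
'x' showing through in the second and fourth cells, while the overwrite
replaces them with blanks.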


printw() and wprintw() - Print to Window

printw(fmt, arg1, arg2, ...)
char *fmt;

wprintw(win, fmt, arg1, arg2, ...)
WINDOW *win;
char *fmt;

Performs a printf() on the window starting at the current (y, x) co-ordinates.
It uses addstr() to add the string on the window. It is often advisable to use
the field width options of printf() to avoid leaving things on the window from
earlier calls. This returns ERR if it would cause the screen to scroll
illegally.
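
The advice about field widths can be seen without a terminal by formatting
into a fixed line of cells. In this sketch a character array stands for one
line of a window, and the helper names (model_print_at_start(),
model_show_plain(), model_show_padded()) and MODEL_WIDTH are hypothetical.
Writing a shorter value over a longer one leaves stale digits behind unless
the format pads the field.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODEL_WIDTH 12   /* hypothetical width of the line */

/*
 * Write text at the start of line without clearing the rest, the way
 * characters added to a window leave neighboring cells alone.
 */
void model_print_at_start(char *line, const char *text)
{
    size_t n = strlen(text);

    assert(n <= MODEL_WIDTH);
    memcpy(line, text, n);   /* no terminator: cells after n are kept */
}

/* Format count with no field width, for example "n=7". */
void model_show_plain(char *line, int count)
{
    char buf[MODEL_WIDTH + 1];

    snprintf(buf, sizeof buf, "n=%d", count);
    model_print_at_start(line, buf);
}

/* Format count left-justified in a five-column field, "n=7    ". */
void model_show_padded(char *line, int count)
{
    char buf[MODEL_WIDTH + 1];

    snprintf(buf, sizeof buf, "n=%-5d", count);
    model_print_at_start(line, buf);
}
```

Showing 12345 and then 7 with model_show_plain() leaves the line reading
"n=72345"; doing the same with model_show_padded() leaves "n=7" followed by
blanks.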
refresh() and wrefresh() - Synchronize

refresh()

wrefresh(win)
WINDOW *win;

Synchronize the terminal screen with the desired window. If the window is not
a screen, only that part covered by it is updated. This returns ERR if it
would cause the screen to scroll illegally. In this case, it updates whatever
it can without causing the scroll.

As a special case, if wrefresh() is called with the window curscr, the screen
is cleared and repainted. This is useful for allowing the user to redraw the
screen as needed.


standout() and wstandout() - Put Characters in Standout Mode

standout()

wstandout(win)
WINDOW *win;

standend()

wstandend(win)
WINDOW *win;

Start and stop putting characters onto win in standout mode. standout() causes
any characters added to the window to be put in standout mode on the terminal
(if it has that capability). standend() stops this. The sequences SO and SE
(or US and UE if they are not defined) are used (see Section 12.5).


Input Functions

cbreak() and nocbreak() - Set or Unset from Cbreak mode

cbreak()
nocbreak()

Set or unset the terminal to/from cbreak mode. The misnamed macros crmode()
and nocrmode() are retained for backward compatibility.


echo() and noecho() - Turn Echo On or Off

echo()
noecho()

Sets the terminal to echo or not echo characters.


getch() and wgetch() - Get Character from Terminal

getch()

wgetch(win)
WINDOW *win;

Gets a character from the terminal and (if necessary) echoes it on the window.
This returns ERR if it would cause the screen to scroll illegally. Otherwise,
the character gotten is returned. If noecho() has been set, then the window is
left unaltered. In order to retain control of the terminal, it is necessary to
have one of noecho(), cbreak(), or raw() set. If you do not set one, whatever
routine you call to read characters sets cbreak for you, and then resets to
the original mode when finished.

Revision A of 9 May 1988
Chapter 12 - The curses Library: Screen-Oriented Cursor Motions

getstr() and wgetstr() - Get String from Terminal

getstr(str)
char *str;

wgetstr(win, str)
WINDOW *win;
char *str;

Get a string through the window and put it in the location pointed to by str,
which is assumed to be large enough to handle it. It sets tty modes if
necessary, and then calls getch() (or wgetch(win)) to get the characters
needed to fill in the string until a NEWLINE or EOF is encountered. The
NEWLINE is stripped off the string. This returns ERR if it would cause the
screen to scroll illegally.

raw() and noraw() - Turn Raw Mode On or Off

raw()
noraw()

Set or unset the terminal to/from raw mode. On version 7 UNIX† systems, this
also turns off NEWLINE mapping (see nl()).

scanw() and wscanw() - Read String from Terminal

scanw(fmt, arg1, arg2, ...)
char *fmt;

wscanw(win, fmt, arg1, arg2, ...)
WINDOW *win;
char *fmt;

Perform a scanf() through the window using fmt. It does this using consecutive
getch()’s (or wgetch(win)’s). This returns ERR if it would cause the screen to
scroll illegally.

Miscellaneous Functions

baudrate() - Get the Baudrate

baudrate()

Returns the baud rate of the terminal. This is a system-dependent constant
(defined in the header file <sys/tty.h>, which is included in <curses.h>).

† UNIX is a registered trademark of AT&T.
delwin() - Delete a Window

delwin(win)
WINDOW *win;

Deletes the window from existence. All resources are freed for future use by
calloc(3). If a window has a subwin() allocated window inside of it, deleting
the outer window does not affect the subwindow, even though this does
invalidate it. Therefore, subwindows should be deleted before their outer
windows are.

endwin() - Finish up Window Routines

endwin()

Finish up window routines before exit. This restores the terminal to the state
it was in before initscr() (or gettmode() and setterm()) was called. endwin()
should always be called before exiting. endwin() does not itself exit; this is
especially useful for resetting tty stats when trapping rubouts via signal(2).

erasechar() - Get Erase Character

erasechar()

Returns the erase character for the terminal; that is, the character used by
the terminal to erase single characters from the input.


getcap() - Get Termcap Capability

char *getcap(str)
char *str;

Return a pointer to the termcap capability described by str (see termcap(5)
for details).


getyx() - Get Current Coordinates

getyx(win, y, x)
WINDOW *win;
int y, x;

Puts the current (y, x) co-ordinates of win in the variables y and x. Since it
is a macro, not a function, you do not pass the address of y and x.
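
Why a macro can fill in y and x without being given their addresses is easiest
to see from a definition. The one below is a sketch, not the definition in
<curses.h>: the structure model_yx_window, its members cury and curx, the
helper model_wmove() and the macro MODEL_GETYX() are all hypothetical names
chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the part of a WINDOW that records the current position. */
struct model_yx_window {
    int cury, curx;
};

/* Set the current position, in the spirit of wmove(). */
void model_wmove(struct model_yx_window *win, int y, int x)
{
    assert(win != NULL);
    win->cury = y;
    win->curx = x;
}

/*
 * The macro's arguments are expanded in place as assignment targets,
 * so the caller names the variables themselves, not their addresses.
 */
#define MODEL_GETYX(win, y, x) \
    do { (y) = (win)->cury; (x) = (win)->curx; } while (0)
```

A call such as MODEL_GETYX(&w, row, col) expands to two assignments to row and
col. A true function could only achieve the same effect through pointers.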


inch() and winch() - Get Character at Current Coordinates

inch()

winch(win)
WINDOW *win;

Returns the character at the current (y, x) co-ordinates on the given window.
This does not make any changes to the window. This has no associated mv
function.
initscr() - Initialize Screen Routines

initscr()

Initialize the screen routines. This must be called before any of the screen
routines are used. It initializes the terminal-type data and such, and without
it, none of the routines can operate. If standard input is not a tty, it sets
the specifications to the terminal whose name is pointed to by Def_term
(initially "dumb"). If the boolean My_term is true, Def_term is always used.
If the window size values for rows and columns as returned by the TIOCGWINSZ
ioctl(2) request are non-zero, they are used. Otherwise, sizes are taken from
the termcap description.

killchar() - Get Kill Character

killchar()

Returns the terminal's line kill character; that is, the character used to
erase an entire line from input.

leaveok() - Set Leave Cursor Flag

leaveok(win, boolf)
WINDOW *win;
bool boolf;

Sets the boolean flag for leaving the cursor after the last change. If boolf
is TRUE, the cursor is left after the last update on the terminal, and the
current (y, x) co-ordinates for win are changed accordingly. If it is FALSE,
it is moved to the current (y, x) co-ordinates. This flag (initially FALSE)
retains its value until changed by the user.

For example, say the current position is (0, 0) and we change the character at
position (5, 10) in the window. After calling refresh(), the cursor is either
moved to position (5, 10) (if the flag is TRUE) or the cursor is left at
position (0, 0) (if the flag is FALSE).

longname() - Get Full Name of Terminal

longname(termbuf, name)
char *termbuf, *name;

fullname(termbuf, name)
char *termbuf, *name;

Fills in name with the long (full) name of the terminal described by the
termcap entry in termbuf. It is generally of little use, but is nice for
telling the user in a readable format what terminal we think the user has.
This is available in the global variable ttytype. termbuf is usually set via
the termcap routine tgetent(). fullname() is the same as longname(), except
that it gives the fullest name given in the entry, which can be quite verbose.
mvwin() - Move Home Position of Window

mvwin(win, y, x)
WINDOW *win;
int y, x;

Move the home position of the window win from its current starting coordinates
to y, x. If that would put part or all of the window off the edge of the
terminal screen, mvwin() returns ERR and does not change anything. For
subwindows, mvwin() also returns ERR if you attempt to move it off its main
window. If you move a main window, all subwindows are moved along with it.


newwin() - Create a New Window

WINDOW *
newwin(lines, cols, begin_y, begin_x)
int lines, cols, begin_y, begin_x;

Create a new window with lines lines and cols columns starting at position
(begin_y, begin_x). If either lines or cols is 0 (zero), that dimension is set
to (LINES - begin_y) or (COLS - begin_x) respectively. Thus, to get a new
window of dimensions LINES x COLS, use newwin(0, 0, 0, 0).
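
The rule for a zero dimension amounts to one line of arithmetic. The helper
below, model_effective_size(), is a hypothetical name used only to spell the
rule out; screen_size stands for LINES or COLS and begin for begin_y or
begin_x.

```c
#include <assert.h>

/*
 * Resolve a requested dimension: zero means "everything from the
 * starting coordinate to the edge of the screen".
 */
int model_effective_size(int requested, int screen_size, int begin)
{
    assert(requested >= 0 && begin >= 0 && begin <= screen_size);
    return requested == 0 ? screen_size - begin : requested;
}
```

On a 24 by 80 terminal, a request of 0 lines starting at row 0 resolves to 24,
and a request of 0 columns starting at column 10 resolves to 70. A non-zero
request is returned unchanged.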


nl() and nonl() - Turn Newline Mode On or Off

nl()
nonl()

Set or unset the terminal to/from nl() mode, that is, start/stop the system
from mapping RETURN to NEWLINE. If the mapping is not done, refresh() can do
more optimization, so it is recommended, but not required, that it be turned
off.



scrollok() - Set Scroll Flag for Window

scrollok(win, boolf)
WINDOW *win;
bool boolf;

Set the scroll flag for the given window. If boolf is FALSE, scrolling is not
allowed. This is its default setting.


subwin() - Create a Subwindow

WINDOW *
subwin(win, lines, cols, begin_y, begin_x)
WINDOW *win;
int lines, cols, begin_y, begin_x;

Create a new window with lines lines and cols columns starting at position
(begin_y, begin_x) in the middle of the window win. This means that any change
made to either window in the area covered by the subwindow is made on both
windows. (begin_y, begin_x) are specified relative to the overall screen, not
the relative (0, 0) of win. If either lines or cols is 0 (zero), that
dimension is set to (LINES - begin_y) or (COLS - begin_x) respectively.

touchline() - Indicate Line Has Been Changed

touchline(win, y, startx, endx)
WINDOW *win;
int y, startx, endx;

This function performs a function similar to touchwin(), but on a single line.
It marks the first change for the given line to be startx, if it is before the
current first change mark, and the last change mark is set to be endx if it is
currently less than endx.
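
The update of the two change marks can be written out directly. In this sketch
model_line_marks and model_touchline() are hypothetical names, not fields or
routines of the library, and the line is assumed to carry a pair of marks
already; how an unchanged line is represented is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Per-line record of the changed span, in columns. */
struct model_line_marks {
    int firstch;   /* leftmost column marked as changed  */
    int lastch;    /* rightmost column marked as changed */
};

/*
 * Widen the changed span to cover [startx, endx]: the first mark only
 * ever moves left and the last mark only ever moves right.
 */
void model_touchline(struct model_line_marks *m, int startx, int endx)
{
    assert(m != NULL && startx <= endx);
    if (startx < m->firstch)
        m->firstch = startx;
    if (m->lastch < endx)
        m->lastch = endx;
}
```

Starting from marks of 10 and 20, touching columns 5 through 15 gives 5 and
20; touching 8 through 30 afterwards gives 5 and 30. The span never shrinks.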


touchoverlap() - Indicate Overlapping Regions Have Been Changed

touchoverlap(win1, win2)
WINDOW *win1, *win2;

Touch the window win2 in the area which overlaps with win1. If they do not
overlap, no changes are made.


touchwin() - Indicate Window Has Been Changed

touchwin(win)
WINDOW *win;

Make it appear that every location on the window has been changed. This is
usually only needed for refreshes with overlapping windows.

unctrl() - Return Representation of Character

unctrl(ch)
char ch;

This is actually a debug function for the library, but it is of general
usefulness. It returns a string which is a representation of ch. Control
characters become their upper-case equivalents preceded by a ^ (circumflex
character). Other letters stay just as they are.

Details

gettmode() - Get tty Statistics

gettmode()

Get the tty stats. This is normally called by initscr().


mvcur() - Move Cursor

mvcur(lasty, lastx, newy, newx)
int lasty, lastx, newy, newx;

Moves the terminal’s cursor from lasty, lastx to newy, newx in an
approximation of optimal fashion.

It is possible to use this optimization without the benefit of the screen
routines. With the screen routines, this should not be called by the user.
move() and refresh() should be used to move the cursor position, so that the
routines know what’s going on.
scroll() - Scroll Window

scroll(win)
WINDOW *win;

Scroll the window upward one line. This is normally not used by the user.


savetty() and resetty() - Save and Reset tty Flags

savetty()
resetty()

savetty() saves the current tty characteristic flags. resetty() restores them
to what savetty() stored. These functions are performed automatically by
initscr() and endwin().


setterm() - Set Terminal Characteristics

setterm(name)
char *name;

Set the terminal characteristics to be those of the terminal named name,
getting the terminal size from the TIOCGWINSZ ioctl(2) request if that size is
non-zero, and otherwise from the environment. This is normally called by
initscr().


tstpO 

This function saves the current tty state and then puts the process to sleep. When 
the process gets restarted, it restores the tty state and then calls 
wref resh (cursor) to redraw the screen. The initscr () function sets 
the signal SIGTSTP to trap to this routine. 

_putchar() 

Put out a character using the put char () macro. This function is used to out¬ 
put every character that curses generates. Thus, it can be redefined by the user 
who wants to do non-standard things with the output. It is named with an initial 
because it usually should be invisible to the programmer. 

12.5. Capabilities from termcap 

Note that the description of terminals is a difficult business, and we only attempt 
to summarize the capabilities here. For a full description see the termcap(5) 
manual pages. 

Overview 

Capabilities from termcap are of three kinds: string-valued options, 
numeric-valued options, and boolean options. The string-valued options are the 
most complicated, since they may include padding information. 

Intelligent terminals often require padding on intelligent operations at high (and 
sometimes even low) speed. This is specified by a number before the string in 
the capability, and has meaning for the capabilities which have a P at the front 
of their comment. This normally is a number of milliseconds to pad the 
operation. In the current system, which has no true programmable delays, we do 
this by sending a sequence of pad characters (normally nulls, but this can be 
changed; the pad character is specified by PC). In some cases, the pad is better 
computed as some number of milliseconds times the number of affected lines (to 
the bottom of the screen usually, except when terminals have insert modes which 
will shift several lines). This is specified as, for example, 12* before the 
capability, to say 12 milliseconds per affected whatever (currently always line). 
Capabilities where this makes sense say 'P*'. 
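The padding rule above can be sketched in C. This is only an illustration under stated assumptions: the names pad_delay_ms() and pad_chars() are hypothetical and are not part of curses or termcap, only whole-number delays are parsed, and each transmitted character is assumed to take about ten bit times on the line.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Parse the optional padding prefix of a termcap-style string
 * capability: a run of digits, optionally followed by '*'.
 * Returns the total delay in milliseconds; with '*', the delay is
 * multiplied by the number of affected lines.  If rest is not NULL,
 * *rest is set to the first character after the prefix.
 * (Hypothetical helper; whole-number delays only.)
 */
int pad_delay_ms(const char *cap, int affected_lines, const char **rest)
{
    int ms = 0;

    while (isdigit((unsigned char)*cap))
        ms = ms * 10 + (*cap++ - '0');
    if (*cap == '*') {
        ms *= affected_lines;
        cap++;
    }
    if (rest != NULL)
        *rest = cap;
    return ms;
}

/*
 * Number of pad characters needed to fill a delay of ms milliseconds
 * at the given line speed, assuming roughly 10 bits per character.
 * Rounds up so the delay is never shorter than requested.
 */
int pad_chars(int ms, int baud)
{
    long chars_per_sec = baud / 10;

    return (int)((ms * chars_per_sec + 999) / 1000);
}
```

With these helpers, a capability that begins with 12* and affects 5 lines gives a delay of 60 milliseconds, which at 9600 baud corresponds to 58 pad characters.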
Variables Set By setterm() 

Table 12-3 Variables Set by setterm() 

Type      Name   Pad   Description 
char *    AL     P*    Add new blank Line 
bool      AM           Automatic Margins 
char *    BC           Back Cursor movement 
bool      BS           Backspace works 
char *    BT     P     Back Tab 
bool      CA           Cursor Addressable 
char *    CD     P*    Clear to end of Display 
char *    CE     P     Clear to End of line 
char *    CL     P*    CLear screen 
char *    CM     P     Cursor Motion 
char *    DC     P*    Delete Character 
char *    DL     P*    Delete Line sequence 
char *    DM           Delete Mode (enter) 
char *    DO           DOwn line sequence 
char *    ED           End Delete mode 
bool      EO           can Erase Overstrikes with ' ' 
char *    EI           End Insert mode 
char *    HO           HOme cursor 
bool      HZ           HaZeltine ~ braindamage 
char *    IC     P     Insert Character 
bool      IN           Insert-Null blessing 
char *    IM           enter Insert Mode (IC usually set, too) 
char *    IP     P*    Pad after char Inserted using IM+IE 
char *    LL           quick to Last Line, column 0 
char *    MA           Ctrl character MAp for cmd mode 
bool      MI           can Move in Insert mode 
bool      NC           No Cr: \r sends \r\n then eats \n 
char *    ND           Non-Destructive space 
bool      OS           OverStrike works 
char      PC           Pad Character 
char *    SE           Standout End (may leave space) 
char *    SF     P     Scroll Forwards 
char *    SO           Stand Out begin (may leave space) 
char *    SR     P     Scroll in Reverse 
char *    TA     P     TAb (not ^I or with padding) 
char *    TE           Terminal address enable Ending sequence 
char *    TI           Terminal address enable Initialization 
char *    UC           Underline a single Character 
char *    UE           Underline Ending sequence 
bool      UL           UnderLining works even though !OS 
char *    UP           UPline 
char *    US           Underline Starting sequence 
char *    VB           Visible Bell 
char *    VE           Visual End sequence 
char *    VS           Visual Start sequence 
bool      XN           a Newline gets eaten after wrap 

Names starting with X are reserved for severely nauseous glitches. 

For purposes of standout (), if SG is not 0, SO is set to NULL, and if UG is not 
0, US is set to NULL. If, after this, SO is NULL, and US is not, SO is set to be US, 
and SE is set to be UE. 
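
The fallback rule for standout() can be written out as a short sketch. The structure standout_caps and the function resolve_standout() are hypothetical names used only for illustration; the real library keeps these values in the global variables listed in Table 12-3.

```c
#include <stddef.h>

/* Hypothetical container for the variables named in the rule above. */
struct standout_caps {
    const char *SO, *SE;   /* standout begin and end sequences */
    const char *US, *UE;   /* underline begin and end sequences */
    int SG, UG;            /* values tested by the rule */
};

/* Apply the rule: drop unusable modes, then fall back to underline. */
void resolve_standout(struct standout_caps *c)
{
    if (c->SG != 0)
        c->SO = NULL;
    if (c->UG != 0)
        c->US = NULL;
    if (c->SO == NULL && c->US != NULL) {
        c->SO = c->US;
        c->SE = c->UE;
    }
}
```

After the call, a terminal whose standout mode is unusable but whose underline mode works ends up with SO and SE holding the underline sequences.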


Variables Set By gettmode() 

Table 12-4 Variables Set By gettmode() 

Type   Name        Description 
bool   NONL        Term can't hack linefeeds doing a CR 
bool   GT          Gtty indicates Tabs 
bool   UPPERCASE   Terminal generates only uppercase letters 


12.6. The WINDOW structure 

The WINDOW structure is defined as follows: 

     /* 
      * Copyright (c) 1980 Regents of the University of California. 
      * All rights reserved.  The Berkeley software License Agreement 
      * specifies the terms and conditions for redistribution. 
      * 
      *      @(#)win_st.c    6.1 (Berkeley) 4/24/86 
      */ 

     # define  WINDOW  struct _win_st 

     struct _win_st { 
             short           _cury, _curx; 
             short           _maxy, _maxx; 
             short           _begy, _begx; 
             short           _flags; 
             short           _ch_off; 
             bool            _clear; 
             bool            _leave; 
             bool            _scroll; 
             char            **_y; 
             short           *_firstch; 
             short           *_lastch; 
             struct _win_st  *_nextp, *_orig; 
     }; 

     # define  _ENDLINE    001 
     # define  _FULLWIN    002 
     # define  _SCROLLWIN  004 
     # define  _FLUSH      010 
     # define  _FULLLINE   020 
     # define  _IDLINE     040 
     # define  _STANDOUT   0200 
     # define  _NOCHANGE   -1 

_cury and _curx are the current (y, x) coordinates for the window. 
New characters added to the screen are added at this point. _maxy and 
_maxx are the maximum values allowed for (_cury, _curx). _begy 
and _begx are the starting (y, x) coordinates on the terminal for the window, 
that is, the window's home. _cury, _curx, _maxy, and _maxx 
are measured relative to (_begy, _begx), not the terminal's home. 
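
Because the current position is stored relative to the window's home, finding where it falls on the terminal takes one addition per axis. The sketch below uses a hypothetical structure win_pos that copies only the four fields involved; it is not the WINDOW structure itself.

```c
/* Hypothetical stand-in for the position fields of WINDOW. */
struct win_pos {
    short cury, curx;   /* current position, relative to the window's home */
    short begy, begx;   /* the window's home on the terminal */
};

/* Translate the window-relative current position to terminal coordinates. */
void win_to_terminal(const struct win_pos *w, int *ty, int *tx)
{
    *ty = w->begy + w->cury;
    *tx = w->begx + w->curx;
}
```

For a window whose home is at (5, 10), a current position of (2, 3) corresponds to terminal position (7, 13).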

_clear tells if a clear-screen sequence is to be generated on the next 
refresh() call. This is only meaningful for screens. The initial clear-screen 
for the first refresh() call is generated by initially setting _clear to be TRUE 
for curscr, which always generates a clear-screen if set, regardless of the 
dimensions of the window involved. _leave is TRUE if the current (y, x) 
coordinates and the cursor are to be left after the last character changed on the 
terminal, or not moved if there is no change. _scroll is TRUE if scrolling 
is allowed. 

_y is a pointer to an array of lines which describe the terminal. Thus: 

     _y[i] 

is a pointer to the ith line, and 

     _y[i][j] 

is the jth character on the ith line. _flags can have one or more values 
or'd into it. 

For windows that are not subwindows, _orig is NULL. For subwindows, it 
points to the main window to which the window is subsidiary. _nextp is a 
pointer in a circularly linked list of all the windows which are subwindows of the 
same main window, plus the main window itself. 

_firstch and _lastch are malloc()ed arrays which contain the index of 
the first and last changed characters on the line. _ch_off is the x offset for the 
window in the _firstch and _lastch arrays for this window. For main 
windows, this is always 0; for subwindows it is the difference between the starting 
point of the main window and that of the subwindow, so that change markers can 
be set relative to the main window. This makes these markers global in scope. 

All subwindows share the appropriate portions of _y, _firstch, _lastch, 
and _insdel with their main window. 
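
The way the change markers combine with the window's x offset can be sketched as follows. This is an illustration of the bookkeeping described above, not the library's source: the function mark_changed() and the constant NOCHANGE are hypothetical names.

```c
#define NOCHANGE (-1)   /* plays the role of _NOCHANGE */

/*
 * Record that column x (window-relative) of one line has changed.
 * first and last point at that line's entries in the _firstch and
 * _lastch arrays; ch_off is the window's _ch_off.
 */
void mark_changed(short *first, short *last, int x, int ch_off)
{
    short col = (short)(x + ch_off);   /* column in the main window */

    if (*first == NOCHANGE) {
        *first = *last = col;          /* first change on this line */
    } else if (col < *first) {
        *first = col;
    } else if (col > *last) {
        *last = col;
    }
}
```

For a subwindow with an offset of 10, changes at columns 4, 1, and 7 leave the markers at 11 and 17, expressed in main-window columns.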

_ENDLINE says that the end of the line for this window is also the end of a 
screen. _FULLWIN says that this window is a screen. _SCROLLWIN indicates 
that the last character of this screen is at the lower right-hand corner of the 
terminal; that is, if a character was put there, the terminal would scroll. 
_FULLLINE says that the width of a line is the same as the width of the 
terminal. If _FLUSH is set, it says that fflush(stdout) should be called at the 
end of each refresh(). _STANDOUT says that all characters added to the screen 
are in standout mode. _INSDEL is reserved for future use, and is set by idlok(). 

_firstch is set to _NOCHANGE for lines on which there has been no change 
since the last refresh(). 

Note: All variables not normally accessed directly by the user are named with an 
initial underscore (_) to avoid conflicts with the user's variables. 

12.7. Example 

Here is a simple example of how to use the package. 

This example (twinkle) is intended to demonstrate the basic structure of a 
program using the screen updating sections of the package. 

This is a moderately simple program which prints pretty patterns on the screen 
that might even hold your interest for 30 seconds or more. It switches between 
patterns of asterisks, putting them on one by one in random order, and then 
taking them off in the same fashion. 


     # include <curses.h> 
     # include <signal.h> 

     /* 
      * the idea for this program was a product 
      * of the imagination of Kurt Schoens.  Not 
      * responsible for minds lost or stolen. 
      */ 

     # define  NCOLS        80 
     # define  NLINES       24 
     # define  MAXPATTERNS  4 

     struct locs { 
             char    y, x; 
     }; 

     typedef struct locs  LOCS; 

     LOCS    Layout[NCOLS * NLINES];   /* current board layout */ 

     int     Pattern,                  /* current pattern number */ 
             Numstars;                 /* number of stars in pattern */ 

     main() { 

             char    *getenv(); 
             int     die(); 

             srand(getpid());          /* initialize random sequence */ 

             initscr(); 
             signal(SIGINT, die); 
             noecho(); 
             nonl(); 
             leaveok(stdscr, TRUE); 
             scrollok(stdscr, FALSE); 

             for (;;) { 
                     makeboard();      /* make the board setup */ 
                     puton('*');       /* put on '*'s */ 
                     puton(' ');       /* cover up with ' 's */ 
             } 
     } 

     /* 
      * On program exit, move the cursor to the lower 
      * left corner by direct addressing, since current 
      * location is not guaranteed.  We lie and say we 
      * used to be at the upper right corner to guarantee 
      * absolute addressing. 
      */ 
     die() { 

             signal(SIGINT, SIG_IGN); 
             mvcur(0, COLS-1, LINES-1, 0); 
             endwin(); 
             exit(0); 
     } 

     /* 
      * Make the current board setup.  It picks a random 
      * pattern and calls ison() to determine if the 
      * character is on that pattern or not. 
      */ 
     makeboard() { 

             reg int   y, x; 
             reg LOCS  *lp; 

             Pattern = rand() % MAXPATTERNS; 
             lp = Layout; 
             for (y = 0; y < NLINES; y++) 
                     for (x = 0; x < NCOLS; x++) 
                             if (ison(y, x)) { 
                                     lp->y = y; 
                                     lp++->x = x; 
                             } 
             Numstars = lp - Layout; 
     } 

     /* 
      * Return TRUE if (y, x) is on the current pattern. 
      */ 
     ison(y, x) 
     reg int y, x; { 

             switch (Pattern) { 
               case 0:     /* alternating lines */ 
                     return !(y & 01); 
               case 1:     /* box */ 
                     if (x >= LINES && y >= NCOLS) 
                             return FALSE; 
                     if (y < 3 || y >= NLINES - 3) 
                             return TRUE; 
                     return (x < 3 || x >= NCOLS - 3); 
               case 2:     /* holy pattern! */ 
                     return ((x + y) & 01); 
               case 3:     /* bar across center */ 
                     return (y >= 9 && y <= 15); 
             } 
             /* NOTREACHED */ 
     } 

     puton(ch) 
     reg char ch; { 

             reg LOCS  *lp; 
             reg int   r; 
             reg LOCS  *end; 
             LOCS      temp; 

             end = &Layout[Numstars]; 
             for (lp = Layout; lp < end; lp++) { 
                     r = rand() % Numstars; 
                     temp = *lp; 
                     *lp = Layout[r]; 
                     Layout[r] = temp; 
             } 

             for (lp = Layout; lp < end; lp++) { 
                     mvaddch(lp->y, lp->x, ch); 
                     refresh(); 
             } 
     } 
System V curses and terminfo 


Screen management programs are a common component of many commercial 
computer applications. These programs handle input and output at a video 
display terminal. A screen program might move a cursor, print a menu, divide a 
terminal screen into windows, or draw a display on the screen to help users enter 
and retrieve information from a database. 

This tutorial explains how to use the System V curses and terminfo 
libraries to write screen management programs on a SunOS system. This 
package includes a library of C routines, a database of terminals and terminal 
capabilities, and a set of SunOS system support tools. To start you writing screen 
management programs as soon as possible, the tutorial does not attempt to cover 
every part of the package. For instance, it covers only the most frequently used 
routines and then points you to curses(3V) and terminfo(5V) in the SunOS 
Reference Manual for more information. 

Because the routines are compiled C functions, you should be familiar with the C 
programming language before using curses/terminfo. You should also be 
familiar with the C language Standard I/O library. 

This chapter has five sections: The Overview describes curses, terminfo, 
and the other components of the System V terminal information utilities package. 

Working with curses Routines describes the basic routines making up the 
curses(3V) library. It covers the routines for writing to a screen, reading from 
a screen, and building windows. It also covers routines for more advanced 
screen management programs that draw line graphics, use a terminal's soft 
labels, and work with more than one terminal at the same time. Many examples 
are included to show the effect of using these routines. 

Working with terminfo Routines describes the routines in the curses library 
that deal directly with the terminfo database to handle certain terminal 
capabilities, such as programming function keys. 

Working with the terminfo Database describes the terminfo database, 
related support tools, and their relationship to the curses library. 

curses Program Examples includes six programs that illustrate various 
curses routines. 


Here the term windows refers to a region within a single terminal screen. 
13.1. Overview 

What is curses? 

curses(3V) is the library of routines that you use to write screen management 
programs on the SunOS system. The routines are C functions and macros; many 
of them resemble routines in the standard C library. For example, there's a 
routine printw() that behaves like printf(3V), and another named getch() 
that behaves like getc(3V). The automatic teller program at your bank might 
use printw() to print its menus and getch() to accept your requests for 
withdrawals (or, better yet, deposits). A visual screen editor like the SunOS 
screen editor vi(1) might also use these and other curses routines. 

The curses library is located in the file /usr/5lib/libcurses.a. To 
compile a program using routines in this library, you must use the optional 
System V /usr/5bin/cc(1V) command, and include the -lcurses option on the 
command line so that the link editor can locate and load them: 

     /usr/5bin/cc file.c -lcurses -o file 

The name curses comes from the cursor optimization that this library of 
routines provides. Cursor optimization minimizes the amount a cursor has to move 
around a screen to update it. For example, if you had designed a screen editor 
program with curses routines and edited the sentence 

curses/terminfo is a great package for creating screens. 

to read 

curses/terminfo is the best package for creating screens. 

the program would output only the string "the best" in place of "a great". The 
other characters would be preserved. Because the amount of data transmitted 
(the output) is minimized, cursor optimization is also referred to as output 
optimization. 
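
The idea of sending only what changed can be sketched by trimming the text that the old and new lines have in common. This is a simplified illustration, not the algorithm curses actually uses (the library also weighs the capabilities of the terminal); the function name changed_region() is hypothetical.

```c
#include <string.h>

/*
 * Find the part of a line that differs between its old and new text.
 * On return, *prefix is the length of the common prefix, and
 * *old_len and *new_len are the lengths of the differing middle
 * parts of the old and new text.
 */
void changed_region(const char *old_text, const char *new_text,
                    size_t *prefix, size_t *old_len, size_t *new_len)
{
    size_t olen = strlen(old_text), nlen = strlen(new_text);
    size_t shorter = olen < nlen ? olen : nlen;
    size_t p = 0, s = 0;

    while (p < shorter && old_text[p] == new_text[p])
        p++;
    /* The common suffix must not overlap the common prefix. */
    while (s < shorter - p &&
           old_text[olen - 1 - s] == new_text[nlen - 1 - s])
        s++;

    *prefix = p;
    *old_len = olen - p - s;
    *new_len = nlen - p - s;
}
```

Applied to the two sentences above, the sketch reports a common prefix of 19 characters, after which 6 old characters are replaced by 7 new ones; the final "t" of "great" and "best" is shared, so it falls into the common suffix.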

Cursor optimization takes care of updating the screen in a manner appropriate for 
the terminal on which a curses program is run. This means that the curses 
library can do what is required to update any of a large number of different 
terminal types. It searches the terminfo database (described below) to find the 
correct description for a terminal. 

How does cursor optimization help you and those who use your programs? First, 
it saves you time in describing in a program how you want to update screens. 
Second, it saves a user’s time when the screen is updated. Third, it reduces the 
load on your system. Fourth, it handles a large variety of terminals on which 
your program might be run. 

Here’s a simple curses program. It uses some of the basic curses routines to 
move a cursor to the middle of a screen and print the character string 
BullsEye. Each of these routines is described in the section Working with 
curses Routines later in this chapter. For now, just look at their names below 
and you will get an idea of what each of them does. 
What is terminfo? 

terminfo refers to both of the following: 

Terminfo Routines 

This is a group of routines within the curses library for handling certain 
terminal capabilities. You can use these routines to program function keys 
(if your terminal has programmable keys), or write filters, for example. 

Shell programmers, as well as C programmers, can use the terminfo 
routines in their programs. 

Terminfo Database 

This is a database containing the descriptions of many terminals that can be 
used with curses programs. These descriptions specify the capabilities of 
a terminal and the way it performs various operations: for example, how 
many lines and columns it has and how its control characters are interpreted. 

Each terminal description in the database is a separate, compiled file. You 
use the source code that terminfo(5V) describes to create these files and 
the command tic(8V) to compile them. 

The compiled files are normally located in the directories 
/usr/share/lib/terminfo/?. These directories have single character 
names, each of which is the first character in the name of a terminal. For 
example, an entry for a virtual terminal emulator is normally located in the file 
/usr/share/lib/terminfo/v/virtual. 
Here is a simple shell script that uses the terminfo database. 

Figure 13-2 A Shell Script Using terminfo Routines 

     # Clear the screen and show the 0,0 position. 
     # 
     tput clear 
     tput cup 0 0          # or tput home 
     echo "<- this is 0 0" 
     # 
     # Show the 5,10 position. 
     # 
     tput cup 5 10 
     echo "<- this is 5 10" 



How curses and terminfo Work Together 

A screen management program with curses routines refers to the terminfo 
database at run time to obtain the information it needs about the terminal being 
used. 

For example, suppose you are using a virtual terminal emulator to display the 
simple "BullsEye" program shown above. To execute properly, the program 
needs to know how many lines and columns the terminal screen has, in order to 
print the BullsEye in the middle of it. The description of the ansi terminal 
type in the terminfo database contains these values. All the curses program 
needs to know beforehand is the name of the terminal type. This is generally set 
automatically when you log in. 


Other Components of the Terminal Information Utilities Package 

Here is a complete list of the components discussed in this tutorial: 

captoinfo(8V) 
     a tool for converting terminal descriptions developed on earlier releases of 
     the SunOS system to terminfo descriptions 

curses(3V) 
     the curses library 

infocmp(8V) 
     a tool for printing and comparing compiled terminal descriptions 

tabs(1V) 
     a tool for setting non-standard tab stops 

terminfo(5V) 
     the System V terminal information database 

tic(8V) 
     a tool for compiling terminal descriptions for the terminfo database 

tput(1V) 
     a tool for initializing the tab stops on a terminal and for outputting the value 
     of a terminal capability 
13.2. Working with curses Routines 

This section describes the basic curses routines for creating interactive screen 
management programs. It begins by describing the routines and other program 
components that every curses program needs to work properly. Then it tells 
you how to compile and run a curses program. Finally, it describes the most 
frequently used curses routines that 

□ write output to and read input from a terminal screen 

□ control the data output and input, for example, to print output in bold type 
  or prevent it from echoing (printing back on a screen) 

□ manipulate multiple screen images (windows) 

□ draw simple graphics 

□ manipulate soft labels on a terminal screen 

□ send output to and accept input from more than one terminal. 

To illustrate the effect of using these routines, we include simple example 
programs as the routines are introduced. We also refer to a group of larger examples 
located in the section curses Program Examples in this chapter. These larger 
examples are more challenging; some make use of routines not discussed here. 

What Every curses Program Needs 

All curses programs need to include the header file <curses.h> and call the 
routines initscr(), refresh() or similar related routines, and endwin(). 

The Header File <curses.h> 

The header file <curses.h> defines several global variables and data 
structures and defines several curses routines as macros. 

To begin, let's consider the variables and data structures defined. <curses.h> 
defines all the parameters used by curses routines. It also defines the integer 
variables LINES and COLS; when a curses program is run on a particular 
terminal, these variables are assigned the vertical and horizontal dimensions of the 
terminal screen, respectively, by the routine initscr() described below. The 
header file defines the constants OK and ERR, too. Most curses routines have 
return values; the OK value is returned if a routine is properly completed, and the 
ERR value if some error occurs. 

LINES and COLS are external (global) variables that represent the size of a ter¬ 
minal screen. The environment variables, LINES and COLUMNS, may be set in 
a user’s shell environment; a curses program uses the environment variables to 
determine the size of a screen. 

For more information about these variables, see The Routines initscr(), 
refresh(), and endwin() and More about initscr() and Lines and 
Columns, below. 

Now let's consider the macro definitions. The <curses.h> header file defines 
many curses routines as macros that call (other macros or) curses routines. 
The line 

     #define refresh() wrefresh(stdscr) 

shows that when refresh() is called, it is expanded to call the curses routine 
wrefresh(). The latter routine, in turn, calls the two curses routines 
wnoutrefresh() and doupdate(). Many other macros also combine two 
or three routines together to achieve a particular result. 

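Macro expansion in curses programs may cause problems with certain 
sophisticated C features, such as the use of automatic incrementing variables: an 
argument with a side effect may be evaluated more than once if the macro 
mentions it more than once. 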
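
The kind of problem meant here can be shown without curses at all. In the sketch below, CLAMP_ROW is a hypothetical macro (not one defined in <curses.h>) whose argument appears twice in its expansion, so an auto-incremented argument is incremented twice; the equivalent function evaluates its argument only once.

```c
/* Hypothetical macro: its argument appears twice in the expansion. */
#define CLAMP_ROW(y)  ((y) < 0 ? 0 : (y))

/* The same logic as a function: the argument is evaluated once. */
int clamp_row(int y)
{
    return y < 0 ? 0 : y;
}

/* Returns the value of row after one call through the macro. */
int row_after_macro(void)
{
    int row = 3;
    int used = CLAMP_ROW(row++);   /* row++ runs twice: used is 4 */

    (void)used;
    return row;                    /* 5, not 4 */
}

/* Returns the value of row after one call through the function. */
int row_after_function(void)
{
    int row = 3;
    int used = clamp_row(row++);   /* row++ runs once: used is 3 */

    (void)used;
    return row;                    /* 4 */
}
```

A safe habit is to do the increment in a separate statement and pass a plain variable to any routine that might be a macro.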

One final point about <curses.h>: it automatically includes <stdio.h> and 
the terminal driver interface file, <termio.h>. Including either file again in a 
program is redundant, but harmless. 

The Routines initscr(), refresh(), and endwin() 

The routines initscr(), refresh(), and endwin() initialize a terminal 
screen to an "in curses state," update the contents of the screen, and restore the 
terminal to an "out of curses state," respectively. Use the simple program that 
we introduced earlier to learn about each of these routines: 

Figure 13-3 initscr(), refresh(), and endwin() in a Program 

     #include <curses.h> 

     main() 
     { 
          initscr();      /* initialize terminal settings and <curses.h> 
                             data structures and variables */ 

          move( LINES/2 - 1, COLS/2 - 4 ); 
          addstr("Bulls"); 
          refresh();      /* send output to (update) terminal screen */ 
          addstr("Eye"); 
          refresh();      /* send more output to terminal screen */ 
          endwin();       /* restore all terminal settings */ 
     } 



A curses program usually starts by calling initscr(); the program should 
call initscr() only once. Using the environment variable TERM as the 
section How curses and terminfo Work Together describes, this routine 
determines what terminal is being used. It then initializes all the declared data 
structures and other variables from <curses.h>. For example, initscr() would 
initialize LINES and COLS for the sample program on whatever terminal it was 
run. If a virtual terminal emulator were to be used, this routine would initialize 
LINES to 24 and COLS to 80. Finally, this routine writes error messages to 
stderr and exits if errors occur. 

During the execution of the program, output and input are handled by routines like 
move() and addstr() in the sample program. For example, 

     move( LINES/2 - 1, COLS/2 - 4 ); 

says to move the cursor to the left of the middle of the screen. Then the line 

     addstr("Bulls"); 

says to write the character string Bulls. With a virtual terminal, these routines 
would position the cursor and write the character string at (11,36). 
Note: All curses routines that move the cursor move it relative to its home 
position in the upper left corner of a screen. The (LINES, COLS) coordinate at 
this position is (0,0), not (1,1). Notice that the vertical coordinate is given 
first and the horizontal second, which is the opposite of the more common x,y 
order of screen (or graph) coordinates. 

The - 1 in the sample program takes the (0,0) position into account to place the 
cursor on the center line of the terminal screen. 
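
The arithmetic behind the move() call in the sample program can be written as a small helper. The function center_start() is a hypothetical name introduced here for illustration; it simply generalizes LINES/2 - 1 and COLS/2 - 4 to a string of any length.

```c
/*
 * Compute where to start a string of length len so that it is
 * centered on a screen of the given size.  Rows and columns are
 * numbered from 0, and the row comes first, as in move(y, x).
 */
void center_start(int lines, int cols, int len, int *y, int *x)
{
    *y = lines / 2 - 1;        /* center line, counting from row 0 */
    *x = cols / 2 - len / 2;   /* half the string lies left of center */
}
```

For a 24-line, 80-column screen and the 8-character string BullsEye, this gives (11, 36), the position quoted above.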

Routines like move() and addstr() do not actually change a physical 
terminal screen when they are called. The screen is updated only when refresh() 
is called. Before this, an internal representation of the screen called a window is 
updated. This is a very important concept, which we discuss below under More 
about refresh() and Windows. 

Finally, a curses program ends by calling endwin(). This routine restores 
all terminal settings and positions the cursor at the lower left corner of the screen. 


Compiling a curses Program 

You compile programs that include curses routines as C language programs 
using the /usr/5bin/cc command, which invokes the C compiler. 

The routines are stored in the library /usr/5lib/libcurses.a. To direct 
the link editor to search this library, you must use the -l option with the cc 
command. 

The general command line for compiling a curses program follows: 

     /usr/5bin/cc file.c -lcurses -o file 

file.c is the name of the source program, and file is the resulting executable 
program. 


More about initscr() and Lines and Columns 

After determining a terminal's screen dimensions, initscr() sets the 
variables LINES and COLS. These variables are set from the terminfo variables 
lines and columns. These, in turn, are set from the values in the terminfo 
database, unless overridden by the window size obtained by the TIOCGWINSZ 
ioctl(2) request. If that size is zero, the values of the environment variables 
LINES and COLUMNS are used. 
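
One reading of this order of precedence is sketched below for a single dimension. The function choose_dimension() is hypothetical, and the order it implements (window size first, then the environment variable, then the database value) is an interpretation of the paragraph above rather than a statement of how initscr() is coded.

```c
#include <stdlib.h>

/*
 * Choose one screen dimension.  winsize_value is the value reported
 * by the window-size request (0 if unknown), env_text is the text of
 * the matching environment variable (NULL if unset), and db_value is
 * the value from the terminal description.
 */
int choose_dimension(int winsize_value, const char *env_text, int db_value)
{
    if (winsize_value > 0)
        return winsize_value;
    if (env_text != NULL && atoi(env_text) > 0)
        return atoi(env_text);
    return db_value;
}
```

A caller would invoke it once for lines and once for columns, passing getenv("LINES") and getenv("COLUMNS") as the environment text.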


More about refresh() and Windows 

As mentioned above, curses routines do not update a terminal until 
refresh() is called. Instead, they write to an internal representation of the 
screen called a window. When refresh() is called, the accumulated output is 
sent from the window to the current terminal screen. 

A window acts a lot like the buffer used by vi(1). When you invoke vi to edit a 
file, the changes you make to the contents of the file are reflected in the buffer. 
The changes become part of the permanent file only when you use the w or ZZ 
command. Similarly, when you invoke a screen program made up of curses 
routines, they change the contents of a window. The changes become part of the 
current terminal screen only when refresh() is called. 

<curses.h> supplies a default window named stdscr (standard screen), 
which is the size of the current terminal's screen, for all programs using curses 
routines. The header file defines stdscr to be of the type WINDOW*, a pointer 
to a C structure which you can think of as a two-dimensional array of characters 
representing a terminal screen. The program always keeps track of what is on the 
physical screen, as well as what is in stdscr. When refresh() is called, it 
compares the two screen images and sends a stream of characters to the terminal 
that make the current screen look like stdscr. A curses program considers 
many different ways to do this, taking into account the various capabilities of the 
terminal, and similarities between what is on the screen and what is on the 
window. It optimizes output by printing as few characters as is possible. The 
following figure illustrates what happens when you execute the "BullsEye" curses 
program. 

You can create other windows and use them instead of stdscr. Windows are 
useful for maintaining several different screen images. For example, many data 
entry and retrieval applications use two windows: one to control input and output 
and one to print error messages that don't mess up the other window. 

It is possible to subdivide a screen into many windows, refreshing each one of 
them as desired. When windows overlap, the contents of the current screen show 
the most recently refreshed window. It is also possible to create a window within 
a window; the smaller window is called a subwindow. Assume that you are 
designing an application that uses forms, for example, an expense voucher, as a 
user interface. You could use subwindows to control access to certain fields on 
the form. 

Some curses routines are designed to work with a special type of window 
called a pad. A pad is a window whose size is not restricted by the size of a 
screen or associated with a particular part of a screen. You can use a pad when 
you have a particularly large window or only need part of the window on the 
screen at any one time. For example, you might use a pad for an application with 
a spread sheet. 

The illustration below represents what a pad, a subwindow, and some other 
windows might look like in comparison to a terminal screen. 

Figure 13-4 Multiple Windows and Pads Mapped to a Terminal Screen 

[Figure: the labeled regions are terminal screen, window (three), subwindow, 
pad, and subpad.] 

The section Building Windows and Pads, later in this chapter, describes the 
routines you use to create and use them. 


Output and Input 

The routines that curses provides for writing to stdscr are similar to those 
provided by the stdio(3V) library for writing to a file. They let you: 

□ write a character at a time - addch() 

□ write a string - addstr() 

□ format a string from a variety of input arguments - printw() 

□ move a cursor or move a cursor and print character(s) - move(), 
  mvaddch(), mvaddstr(), mvprintw() 

□ clear a screen or a part of it - clear(), erase(), clrtoeol(), 
  clrtobot() 

Following are descriptions and examples of these routines. 

The curses library provides its own set of output and input functions. You 
should not use other I/O routines or system calls, like read(2) and write(2), in 
a curses program. They may cause undesirable results when you run the 
program. 

addch() - Write a single character to stdscr 

     #include <curses.h> 

     int addch(ch) 
     chtype ch; 

addch() is a macro that writes a single character to stdscr. The character is 
of the type chtype, which is defined in <curses.h>. chtype contains both 
data and attributes (see Output Attributes in this chapter for information about 
attributes); when working with variables of this type, make sure you declare them 
as chtype, and not as the underlying data type (for example, short) of 
chtype. This will ensure future compatibility. 

addch() does some character translations. For example, it maps the 
<NEWLINE> character to a clear-to-end-of-line, and moves the cursor to the next 
line. It maps the <TAB> character to an appropriate number of blanks. It maps 
other control characters to the appropriate ^X notation. 

addch() normally returns OK. The only time addch() returns ERR is after 
adding a character to the lower right-hand corner of a window that does not 
scroll. 
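
The tab and control-character translations can be sketched as a stand-alone function. This is an illustration only, not the library's code: expand_char() is a hypothetical name, tab stops every 8 columns are an assumption, and the <NEWLINE> case is left to the caller because it involves clearing the line and moving the cursor rather than producing text.

```c
#include <stddef.h>

/*
 * Expand one character along the lines described for addch():
 * a tab becomes blanks up to the next tab stop, and any other
 * control character becomes '^' followed by a printable character.
 * col is the current column; out must hold at least 9 bytes.
 * Returns the number of characters written, not counting the
 * terminating null.  A newline produces no text here.
 */
size_t expand_char(int ch, int col, char *out)
{
    size_t n = 0;

    if (ch == '\n') {
        /* handled separately by the caller */
    } else if (ch == '\t') {
        int stop = (col / 8 + 1) * 8;   /* assumed 8-column tab stops */

        while (col++ < stop)
            out[n++] = ' ';
    } else if (ch >= 0 && ch < 0x20) {
        out[n++] = '^';
        out[n++] = (char)(ch + '@');    /* for example, 1 becomes 'A' */
    } else {
        out[n++] = (char)ch;
    }
    out[n] = '\0';
    return n;
}
```

For example, a tab seen at column 3 expands to five blanks, and the character with code 1 is shown as ^A.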
Example: 

     #include <curses.h> 

     main() 
     { 
          initscr(); 
          addch('a'); 
          refresh(); 
          endwin(); 
     } 

produces: 

     a 

Also see the show program under curses Program Examples later in this 
chapter. 


addstr() - write a string of characters to stdscr 

     #include <curses.h> 

     int addstr(str) 
     char *str; 

addstr() is a macro that follows the same translation rules as addch(); it 
calls addch() to write each character. addstr() returns OK on success and 
ERR on error. 

For an example, refer to the "BullsEye" program, above. 

printw() - formatted printing on stdscr 

     #include <curses.h> 

     int printw(fmt [,arg...]) 
     char *fmt; 

Like printf, printw() takes a format string and a variable number of 
arguments. Like addstr(), printw() calls addch() to write the string. 
printw() returns OK on success and ERR on error. 
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Example: 


tinclude <curses.h> 

main ( ) 

{ 

char* title = "Not specified”; 
int no = 0; 

initscr () ; 

printw(”%s is not in stock.\n*% title); 

printw( Please ask the cashier to order %d for you.Xn”# no)• 

refreshO; 

endwin(); 

} 


produces: 


Not specified is not in stock. 

Rlease ask the cashier to order 0 for you. 

-V 



J 


move() — position the cursor for stdscr

#include <curses.h>

int move(y, x)
int y, x;

move() positions the cursor for stdscr at the given row y and the given
column x.

Notice that move() takes the y coordinate before the x coordinate. The upper
left-hand coordinates for stdscr are (0,0), the lower right-hand (LINES - 1,
COLS - 1). See the section initscr(), refresh(), and endwin() for more
information.
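The coordinate range just described can be sketched as a stand-alone check.
cursor_in_bounds() is a hypothetical helper, not part of curses; its lines
and cols parameters stand in for the curses variables LINES and COLS, and it
takes the row before the column, as move() does.

```c
/* Hypothetical helper, not a curses routine: nonzero if (y, x) lies on a
 * screen of `lines` rows by `cols` columns, that is, within (0,0) through
 * (lines - 1, cols - 1). Row first, column second, as with move(). */
int cursor_in_bounds(int y, int x, int lines, int cols)
{
    return y >= 0 && y < lines && x >= 0 && x < cols;
}
```

In a curses program you might call cursor_in_bounds(y, x, LINES, COLS)
before move(y, x) if the coordinates come from user input.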

move() returns OK on success and ERR on error. Trying to move to a screen
position of less than (0,0) or more than (LINES - 1, COLS - 1) causes an error.

mvprintw() — move and print a formatted string

#include <curses.h>

int mvprintw(y, x, fmt [,arg...])

mvprintw() is a macro that moves the cursor to a given position, as if by
using move(), and then prints a formatted string.

clear() and erase() — clear the screen

#include <curses.h>

int clear()
int erase()

clear() and erase() are macros that convert stdscr to all blanks.
clear() assumes that the screen may have garbage that it doesn't know about;
it first calls erase() and then clearok(), which clears the physical screen
completely on the next call to refresh(). initscr() automatically calls
clear().

clear() always returns OK; erase() returns no useful value.

clrtoeol() and clrtobot() — partial screen clears

#include <curses.h>

int clrtoeol()
int clrtobot()

clrtoeol() and clrtobot() are macros that clear a portion of the screen.
clrtoeol() changes the remainder of a line to all blanks. clrtobot()
changes the remainder of a screen to all blanks. Both start with the current
cursor position inclusive.

Neither returns any useful value.


Example:

    #include <curses.h>

    main()
    {
        initscr();
        addstr("Press <CR> to delete from here to the end of the line and on.");
        addstr("\nDelete this too.\nAnd this.");
        move(0,30);
        refresh();
        getch();
        clrtobot();
        refresh();
        endwin();
    }

produces:

    Press <CR> to delete from here to the end of the line and on.
    Delete this too.
    And this.

Notice the two calls to refresh(): one to send the full screen of text to a
terminal, the other to clear from the position indicated to the bottom of a
screen.

Here's what the screen looks like when you press RETURN:

    Press <CR> to delete from here

See the show and two programs under curses Example Programs for examples
of clrtoeol().

Input

curses routines for reading from the current terminal are similar to those
provided by the stdio(3V) library for reading from a file. They let you

□ read a character at a time — getch()

□ read a NEWLINE-terminated string — getstr()

□ parse input, converting and assigning selected data to an argument list —
  scanw()

The primary routine is getch(), which processes a single input character and
then returns that character. This routine is like the C library routine
getchar(3V) except that it makes several terminal- or system-dependent
options available that are not possible with getchar(). For example, you can
use getch() with the curses routine keypad(), which allows a curses
program to interpret extra keys on a user's terminal, such as arrow keys,
function keys, and other special keys that transmit escape sequences, and
treat them as just another key.

getch() — read a single character from the current terminal

#include <curses.h>

int getch()

getch() is a macro that returns the value of the character or ERR on 'end of
file', receipt of signals, or non-blocking read with no input.

See the discussions about echo(), noecho(), cbreak(), nocbreak(),
raw(), noraw(), halfdelay(), nodelay(), and keypad() below.

Example:

    #include <curses.h>

    main()
    {
        int ch;

        initscr();
        cbreak();       /* Explained later in the section "Input Options" */
        addstr("Press any character: ");
        refresh();
        ch = getch();
        printw("\n\n\nThe character entered was a '%c'.\n", ch);
        refresh();
        endwin();
    }

The first refresh() sends the addstr() character string from stdscr to
the terminal:

    Press any character: ■

Then assume that a w is typed at the keyboard. getch() accepts the character
and assigns it to ch. Finally, the second refresh() is called:

    Press any character: w



    The character entered was a 'w'.

For another example of getch(), see the show program under curses Example
Programs.


getstr() — read character string into a buffer

#include <curses.h>

int getstr(str)
char *str;

getstr() is a macro that calls getch() to read a string of characters into a
buffer, until a RETURN, NEWLINE, or ENTER key is received from
stdscr. getstr() does not check for buffer overflow.

getstr() returns ERR if getch() returns ERR; otherwise it returns OK.
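Because getstr() does not check for buffer overflow, a program that reads
input of unknown length may prefer a bounded loop around getch(). The
following is a minimal sketch of that idea, not a curses routine:
read_line_bounded(), demo_input, and demo_next() are hypothetical names. The
character source is passed in as a function so the loop can be tried without
a terminal; in a curses program it would be a small wrapper that calls
getch() and returns a negative value when getch() returns ERR.

```c
#include <stddef.h>

/* Hypothetical helper, not part of curses. Reads characters from next()
 * until a newline or carriage return, a negative value (for example an
 * error return), or until size - 1 characters have been stored. buf is
 * always NUL-terminated. Returns the number of characters stored. */
size_t read_line_bounded(int (*next)(void), char *buf, size_t size)
{
    size_t n = 0;
    int ch;

    if (size == 0)
        return 0;
    while (n + 1 < size) {
        ch = next();
        if (ch < 0 || ch == '\n' || ch == '\r')
            break;
        buf[n++] = (char)ch;
    }
    buf[n] = '\0';
    return n;
}

/* Stand-in character source for trying the helper without a terminal:
 * returns successive characters of demo_input, then -1. */
const char *demo_input = "";

int demo_next(void)
{
    return *demo_input ? (unsigned char)*demo_input++ : -1;
}
```

Note that when the buffer fills, this sketch simply stops reading; any
remaining characters on the line are left for the next read.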

See the discussions about echo(), noecho(), cbreak(), nocbreak(),
raw(), noraw(), halfdelay(), nodelay(), and keypad() below.

Example:

    #include <curses.h>

    main()
    {
        char str[256];

        initscr();
        cbreak();       /* Explained later in the section "Input Options" */
        addstr("Enter a character string terminated by <CR>:\n\n");
        refresh();
        getstr(str);
        printw("\n\n\nThe string entered was \n'%s'\n", str);
        refresh();
        endwin();
    }

If you enter the string 'I enjoy learning about the SunOS system', the final
screen (after entering RETURN) would appear as:

    Enter a character string terminated by <CR>:

    I enjoy learning about the SunOS system



    The string entered was
    'I enjoy learning about the SunOS system'



scanw() — formatted input conversion

#include <curses.h>

int scanw(fmt [, arg...])
char *fmt;

Like scanf(3V), scanw() uses a format string to convert input words and
assign them to a variable number of arguments. scanw() returns the same
values as scanf().

See scanf(3V) for more information.

Example:

    #include <curses.h>

    main()
    {
        char string[100];
        float number;

        initscr();
        cbreak();       /* Explained later in the */
        echo();         /* section "Input Options" */
        addstr("Enter a number and a string separated by a comma: ");
        refresh();
        scanw("%f,%s", &number, string);
        clear();
        printw("The string was \"%s\" and the number was %f.", string, number);
        refresh();
        endwin();
    }

Notice the two calls to refresh(). The first call updates the screen with the
character string passed to addstr(), the second with the string returned from
scanw(). Also notice the call to clear(). Assume you entered the following
when prompted: 2,twin. After running this program, your terminal screen
would appear as follows:

    The string was "twin" and the number was 2.000000.

    $ ■



Controlling Output and Input 

Output Attributes

When we talked about addch(), we said that it writes a single character of the
type chtype to stdscr. chtype has two parts: a part with information about
the character itself, and another part with information about a set of
attributes associated with the character. These attributes allow a character
to be printed in reverse video, bold, underlined, and so on.

stdscr always has a set of current attributes that it associates with each
character as it is written. However, using the routine attrset() and the
related curses routines described below, you can change the current
attributes. Below is a list of the attributes and what they mean.

    A_BLINK         blinking
    A_BOLD          extra bright or bold
    A_DIM           half bright
    A_REVERSE       reverse video
    A_STANDOUT      a terminal's best highlighting mode
    A_UNDERLINE     underlining
    A_ALTCHARSET    alternate character set

(See the section Drawing Lines and Other Graphics, below, for more
information about these attributes.)

Not all terminals are capable of displaying all attributes. If a particular
terminal cannot display a requested attribute, a curses program attempts to
find a substitute attribute. If none is possible, the attribute is ignored.

To use these attributes, you must pass them as arguments to attrset() and
related routines; they can also be OR'ed with the bitwise OR (|) to addch().

Let’s consider a use of one of these attributes. To display a word in bold, use the 
following code: 


    printw("A word in ");
    attrset(A_BOLD);
    printw("boldface");
    attrset(0);
    printw(" really stands out.\n");
    refresh();



Attributes can be turned on singly, such as attrset(A_BOLD) in the example,
or in combination. To turn on blinking bold text, for example, you would use
attrset(A_BLINK | A_BOLD). Individual attributes can be turned on
and off with the curses routines attron() and attroff() without affecting
other attributes. attrset(0) turns all attributes off.

Notice the attribute called A_STANDOUT. You might use it to make text attract
the attention of a user. The particular hardware attribute used for standout
is the most visually pleasing attribute a terminal has. Standout is typically
implemented as reverse video or bold. Many programs don't really need a
specific attribute, such as bold or reverse video, but instead just need to
highlight some text. For such applications, the A_STANDOUT attribute is
recommended. Two convenient functions, standout() and standend(), can be
used to turn this attribute on and off. standend(), in fact, turns off all
attributes.
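The way these routines combine attributes can be modeled with ordinary bit
operations. The sketch below is only a model of the behavior described
above, not the curses implementation: demo_attr stands in for chtype, the
DEMO_ bit values are hypothetical (the real ones come from <curses.h>), and
each demo_ function takes the current attribute set and returns the new one
instead of updating stdscr.

```c
typedef unsigned long demo_attr;      /* stand-in for chtype */

#define DEMO_STANDOUT  0x1UL          /* hypothetical bit values, */
#define DEMO_BOLD      0x2UL          /* not those in <curses.h>  */
#define DEMO_BLINK     0x4UL

/* attrset(): replace whatever is on with the requested attributes */
demo_attr demo_attrset(demo_attr cur, demo_attr attrs)
{
    (void)cur;
    return attrs;
}

/* attron(): add the requested attributes to those already on */
demo_attr demo_attron(demo_attr cur, demo_attr attrs)
{
    return cur | attrs;
}

/* attroff(): remove the requested attributes, leaving the others alone */
demo_attr demo_attroff(demo_attr cur, demo_attr attrs)
{
    return cur & ~attrs;
}

/* standend(): turn everything off, like attrset(0) */
demo_attr demo_standend(demo_attr cur)
{
    return demo_attrset(cur, 0);
}
```

For example, starting from blinking bold, turning blinking off leaves bold
on, and a later standend() clears bold as well.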

Bit Masks

In addition to the attributes listed above, there are two bit masks called
A_CHARTEXT and A_ATTRIBUTES. You can use these bit masks with the
curses function inch() and the C bitwise AND (&) operator to extract the
character or attributes of a position on a terminal screen. See the discussion
of inch() for more information.
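The extraction described above can be illustrated without a terminal. In
this sketch the layout is an assumption made for the example: the low 8 bits
hold the character and the remaining bits hold attributes. The names
demo_chtype, DEMO_A_CHARTEXT, DEMO_A_ATTRIBUTES, and DEMO_A_BOLD are
hypothetical stand-ins; a real program would use chtype and the masks
defined in <curses.h>.

```c
typedef unsigned long demo_chtype;          /* stand-in for chtype */

/* Hypothetical layout: low 8 bits hold the character, the rest hold
 * attributes. The real masks are defined in <curses.h>. */
#define DEMO_A_CHARTEXT    0x000000ffUL
#define DEMO_A_ATTRIBUTES  (~DEMO_A_CHARTEXT)
#define DEMO_A_BOLD        0x00000100UL

/* Character part of a cell: cell & A_CHARTEXT in a real program */
int demo_char_of(demo_chtype cell)
{
    return (int)(cell & DEMO_A_CHARTEXT);
}

/* Attribute part of a cell: cell & A_ATTRIBUTES in a real program */
demo_chtype demo_attrs_of(demo_chtype cell)
{
    return cell & DEMO_A_ATTRIBUTES;
}
```

With the real library, the same pattern would be applied to the value
returned by inch(), for example inch() & A_CHARTEXT.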

Following are descriptions of attrset() and the other curses routines that
you can use to manipulate attributes.

attron(), attrset(), and attroff() — set or modify attributes

#include <curses.h>

int attron(attrs)
chtype attrs;

int attrset(attrs)
chtype attrs;

int attroff(attrs)
chtype attrs;

attron() turns on the requested attribute attrs in addition to any that are
currently on. attrs is of the type chtype, which is defined in <curses.h>.

attrset() turns on the requested attributes attrs instead of any that are
currently turned on.

attroff() turns off the requested attributes, attrs, if they are on.

Attributes may be combined using the bitwise OR (|).

All return OK.

Example:

See the highlight program under curses Example Programs, below.

standout() and standend() — highlight with preferred attribute

#include <curses.h>

int standout()
int standend()

standout() turns on the preferred highlighting attribute, A_STANDOUT, for
the current terminal. This routine is equivalent to attron(A_STANDOUT).

standend() turns off all attributes. This routine is equivalent to
attrset(0).

Both always return OK.

Example:

See the highlight program under curses Example Programs, below.

Bells, Whistles, and Flashing Lights

Occasionally, you may want to get a user's attention. Two curses
routines were designed to help you do this. They let you ring the terminal's
bell and flash its screen.

flash() flashes the screen if possible, and otherwise rings the bell. Flashing
the screen is intended as a bell replacement, and is particularly useful if
the bell bothers someone within earshot of the user. The routine beep() can
be called when an audible bell is desired. (If for some reason the terminal is
unable to beep, but able to flash, a call to beep() will flash the screen.)

beep() and flash() — ring bell or flash screen

#include <curses.h>

int flash()
int beep()

flash() tries to flash the terminal screen, if possible; otherwise it tries to
ring the terminal bell.

beep() tries to ring the terminal bell, if possible, and, if not, tries to
flash the terminal screen.

Neither returns any useful value.

Input Options

The SunOS system does a considerable amount of processing on input before an
application ever sees a character; amongst other things, it:

□ echoes (prints back) characters to a terminal as they are typed

□ interprets an erase character, typically DELETE, and a line kill character,
  typically CTRL-U (control-U)

□ interprets a CTRL-D as the end-of-file (EOF) character

□ interprets interrupt and quit characters

□ strips the character's parity bit

□ translates RETURN characters to NEWLINEs.

Because a curses program maintains total control over the screen, curses
turns off echoing; it does the echoing itself. For an interactive screen, you
may not want the system to process characters in the standard way. Some
curses routines, noecho() and cbreak(), for example, have been designed so
that you can alter the standard character processing. Using these routines in
an application controls how input is interpreted.

Every curses program accepting input should set some input options so that
when the program starts running, the terminal on which it runs will be in
cbreak(), raw(), nocbreak(), or noraw() mode. Although the
curses program starts up in echo() mode, as shown below, none of the other
modes are guaranteed.

The combination of noecho() and cbreak() is most common in interactive
screen management programs. Suppose, for instance, that you don't want the
characters sent to your application program to be echoed wherever the cursor
currently happens to be; instead, you want them echoed at the bottom of the
screen. The curses routine noecho() is designed for this purpose. However,
when noecho() turns off echoing, normal erase and kill processing is still
on. Using the routine cbreak() causes these characters to be uninterpreted.

13-5  Input Option Settings for curses Programs

    Input Options              Characters
                               Interpreted           Uninterpreted
    -----------------------------------------------------------------
    Normal                     interrupt, quit
    'out of curses state'      stripping
                               <CR> to <NL>
                               echoing
                               erase, kill
                               EOF

    Normal                     echoing               All else
    curses 'startup state'     (simulated)           undefined.

    cbreak()                   interrupt, quit       erase, kill
    and echo()                 stripping             EOF
                               echoing

    cbreak()                   interrupt, quit       echoing
    and noecho()               stripping             erase, kill
                                                     EOF

    nocbreak()                 break, quit           echoing
    and noecho()               stripping
                               erase, kill
                               EOF

    nocbreak()                 See caution below.
    and echo()

    nl()                       <CR> to <NL>

    nonl()                                           <CR> to <NL>

    raw()                                            break, quit
    (instead of cbreak())                            stripping

Do not use the combination nocbreak() and echo(). If you use it in a
program and also use getch(), the program will go in and out of cbreak()
mode to get each character. Depending on the state of the terminal driver when
each character is typed, the program may produce undesirable output.

In addition to the routines noted above, you can use the curses routines
noraw(), halfdelay(), and nodelay() to control input. These routines
are described in curses(3V).

echo() and noecho() — turn echoing on and off

#include <curses.h>

int echo()
int noecho()

echo() turns on echoing of characters by curses as they are read in. This is
the initial setting.

noecho() turns off the echoing.

Neither returns any useful value.

curses programs may not run properly if you turn on echoing with
nocbreak(). After you turn echoing off, you can still echo characters with
addch().

Examples:

See the editor and show programs under curses Program Examples,
below.

cbreak() and nocbreak() — turn "break for each character" on or off

#include <curses.h>

int cbreak()
int nocbreak()

cbreak() turns on 'break for each character' processing. A program gets each
character as soon as it is typed, but the erase, line kill, and CTRL-D
characters are not interpreted.

nocbreak() returns to normal 'line at a time' processing. This is typically
the initial setting.

Neither returns any useful value.

A curses program may not run properly if cbreak() is turned on and off
within the same program or if the combination nocbreak() and echo() is
used.

Example:

See the editor and show programs under curses Program Examples.

Building Windows and Pads

The section above entitled More about refresh() and Windows explained
what windows and pads are and why you might want to use them. This section
describes the curses routines you use to manipulate and create windows and
pads.

Window Output and Input

The routines that you use to send output to and get input from windows and pads
are similar to those you use with stdscr. The only difference is that you have
to give the name of the window to receive the action. Generally, these functions
have names formed by putting the letter w at the beginning of the name of a
stdscr routine and adding the window name as the first parameter. For
example, addch('c') would become waddch(mywin, 'c') if you
wanted to write the character c to the window mywin. Here's a
list of the window (or w) versions of the output routines discussed in Getting
Simple Output and Input.

    waddch(win, ch)
    mvwaddch(win, y, x, ch)
    waddstr(win, str)
    mvwaddstr(win, y, x, str)
    wprintw(win, fmt [, arg ...])
    mvwprintw(win, y, x, fmt [, arg ...])
    wmove(win, y, x)
    wclear(win) and werase(win)
    wclrtoeol(win) and wclrtobot(win)
    wrefresh(win)

You can see from their declarations that these routines differ from the
versions that manipulate stdscr only in their names and the addition of a win
argument. Notice that the routines whose names begin with mvw take the win
argument before the y, x coordinates, which is contrary to what the names
imply. See curses(3V) for more information about these routines, or the
versions of the input routines getch(), getstr(), and so on that you should
use with windows.

All w routines can be used with pads except for wrefresh() and
wnoutrefresh(). In place of these two routines, you have to use
prefresh() and pnoutrefresh() with pads.

The Routines wnoutrefresh() and doupdate()

If you recall from the earlier discussion about refresh(), we said that it
sends the output from stdscr to the terminal screen. We also said that it was
a macro that expands to wrefresh(stdscr) (see What Every curses Program
Needs and More about refresh() and Windows).

The wrefresh() routine is used to send the contents of a window (stdscr or
one that you create) to a screen; it calls the routines wnoutrefresh() and
doupdate(). Similarly, prefresh() sends the contents of a pad to a screen
by calling pnoutrefresh() and doupdate().

Using wnoutrefresh() or pnoutrefresh() (this discussion will be limited to
the former routine for simplicity) and doupdate(), you can update terminal
screens with more efficiency than using wrefresh() by itself.
wrefresh() works by first calling wnoutrefresh(), which copies the
named window to a data structure referred to as the virtual screen. The virtual
screen contains what a program intends to display at a terminal. After calling
wnoutrefresh(), wrefresh() then calls doupdate(), which compares
the virtual screen to the physical screen and does the actual update. If you
want to output several windows at once, calling wrefresh() for each one will
result in alternating calls to wnoutrefresh() and doupdate(), causing
several bursts of output to a screen. However, by calling wnoutrefresh() for
each window and then doupdate() only once, you can minimize the total number
of characters transmitted and the processor time used. The sample program
below uses only one doupdate().


    #include <curses.h>

    main()
    {
        WINDOW *w1, *w2;

        initscr();
        w1 = newwin(2,6,0,3);
        w2 = newwin(1,4,5,4);
        waddstr(w1, "Bulls");
        wnoutrefresh(w1);
        waddstr(w2, "Eye");
        wnoutrefresh(w2);
        doupdate();
        endwin();
    }

Notice from the sample that you declare a new window at the beginning of a
curses program. The lines

    w1 = newwin(2,6,0,3);
    w2 = newwin(1,4,5,4);

declare two windows named w1 and w2 with the routine newwin() according
to certain specifications.


New Windows

Following are descriptions of the routines newwin() and subwin(), which
you use to create new windows. For information about creating new pads with
newpad() and subpad(), see curses(3V).

newwin() — open and return a pointer to new window

#include <curses.h>

WINDOW *newwin(nlines, ncols, begin_y, begin_x)
int nlines, ncols, begin_y, begin_x;

newwin() returns a pointer to a new window with a new data area. The
variables nlines and ncols give the size of the new window. begin_y and
begin_x give the screen coordinates from (0,0) of the upper left corner of the
window as it is refreshed to the current screen.

Example:

See the window program under curses Program Examples.

subwin()

#include <curses.h>

WINDOW *subwin(orig, nlines, ncols, begin_y, begin_x)
WINDOW *orig;
int nlines, ncols, begin_y, begin_x;

subwin() returns a new window that points to a section of another window,
orig. nlines and ncols give the size of the new subwindow. begin_y
and begin_x give the screen coordinates of the upper left corner of the window
as it is refreshed to the current screen.

Subwindows and original windows can accidentally overwrite one another.
Subwindows of subwindows are not allowed.

Example:

    #include <curses.h>

    main()
    {
        WINDOW *sub;

        initscr();
        box(stdscr,'w','w');    /* See the curses(3V) manual page for box() */
        mvwaddstr(stdscr,7,10,"-this is 10,10");
        mvwaddch(stdscr,8,10,'|');
        mvwaddch(stdscr,9,10,'v');
        sub = subwin(stdscr,10,20,10,10);
        box(sub,'s','s');
        wnoutrefresh(stdscr);
        wrefresh(sub);
        endwin();
    }

This program prints a border of ws around the stdscr (the sides of your
terminal screen) and a border of s characters around the subwindow sub when
it is run.


Using Advanced curses Features

Knowing how to use the basic curses routines to get output and input and to
work with windows, you can design screen management programs that meet the
needs of many users. The curses library, however, has routines that let you do
more in a program than handle I/O and multiple windows. The following few
pages briefly describe some of these routines and what they can help you do:
namely, draw simple graphics, use a terminal's soft labels, and work with more
than one terminal in a single curses program.

You should be comfortable using the routines previously discussed in this
chapter and the other routines for I/O and window manipulation discussed on the
curses(3V) manual page before you try to use the advanced curses features.

Routines for Drawing Lines and Other Graphics

Many terminals have an alternate character set for drawing simple graphics (or
glyphs, or graphic symbols). You can use this character set in curses
programs. curses uses the same names for glyphs as the VT100 line drawing
character set.

To use the alternate character set in a curses program, pass a set of variables
whose names begin with ACS_ to the curses routine waddch() or a related
routine. For example, ACS_ULCORNER is the variable for the upper left corner
glyph. If a terminal has a line drawing character for this glyph,
ACS_ULCORNER's value is the terminal's character for that glyph, ORed (|)
with the bit-mask A_ALTCHARSET. If no line-drawing character is available for
that glyph, a standard ASCII character that approximates the glyph is stored in
its place. For example, the default character for ACS_HLINE, a horizontal
line, is a - (minus sign). When a close approximation is not available, a +
(plus sign) is used. All the standard ACS_ names and their defaults are listed
in curses(3V).

Part of an example program that uses line drawing characters follows. The
example uses the curses routine box() to draw a box around a menu on a
screen. box() uses the line drawing characters by default or when | (the
pipe) and - (minus sign) are chosen. (See curses(3V).) Up and down more
indicators are drawn on the box border (using ACS_UARROW and ACS_DARROW) if
the menu contained within the box continues above or below the screen:

    box(menuwin, ACS_VLINE, ACS_HLINE);

    /* output the up/down arrows */
    wmove(menuwin, maxy, maxx - 5);

    /* output up arrow or horizontal line */
    if (moreabove)
        waddch(menuwin, ACS_UARROW);
    else
        waddch(menuwin, ACS_HLINE);

    /* output down arrow or horizontal line */
    if (morebelow)
        waddch(menuwin, ACS_DARROW);
    else
        waddch(menuwin, ACS_HLINE);

Here's another example. Because a default down arrow (like the lowercase letter
v) isn't very discernible on a screen with many lowercase characters on it, you
can change it to an uppercase V.

    if (!(ACS_DARROW & A_ALTCHARSET))
        ACS_DARROW = 'V';

Routines for Using Soft Labels

Another feature available on most terminals is a set of soft labels across the
bottom of their screens. A terminal's soft labels are usually matched with a
set of hard function keys on the keyboard. There are usually eight of these
labels, each of which is usually eight characters wide and one or two lines
high.

The curses library has routines that provide a uniform model of eight soft
labels on the screen. If a terminal does not have soft labels, the bottom line
of its screen is converted into a soft label area. It is not necessary for the
keyboard to have hard function keys to match the soft labels for a curses
program to make use of them.

Let's briefly discuss most of the curses routines needed to use soft labels:
slk_init(), slk_set(), slk_refresh() and slk_noutrefresh(),
slk_clear(), and slk_restore().

When you use soft labels in a curses program, you have to call the routine
slk_init() before initscr(). This sets an internal flag for initscr() to
look at that says to use the soft labels. If initscr() discovers that there are
fewer than eight soft labels on the screen, that they are smaller than eight
characters in size, or that there is no way to program them, then it will
remove a line from the bottom of stdscr to use for the soft labels. The size
of stdscr and the LINES variable will be reduced by 1 to reflect this change.
A properly written program, one that is written to use the LINES and COLS
variables, will continue to run as if the line had never existed on the screen.

slk_init() takes a single argument. It determines how the labels are
grouped on the screen should a line get removed from stdscr. The choices are
between a 3-2-3 arrangement, and a 4-4 arrangement. The curses routines
adjust the width and placement of the labels to maintain the pattern. The widest
label generated is eight characters.

The routine slk_set() takes three arguments: the label number (1-8), the
string to go on the label (up to eight characters), and the justification within
the label (0 = left-justified, 1 = centered, and 2 = right-justified).
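To make the three justification codes concrete, here is a small plain C
model of how text might be laid out within an eight-character label. It is
an illustration only, not how curses implements slk_set(): justify_label()
and LABEL_WIDTH are hypothetical names, and the truncation and rounding
choices are assumptions made for the sketch.

```c
#include <string.h>

#define LABEL_WIDTH 8

/* Hypothetical helper, not a curses routine: lays text out in an
 * 8-character label. fmt follows the slk_set() convention described
 * above: 0 = left-justified, 1 = centered, 2 = right-justified.
 * out must hold LABEL_WIDTH + 1 bytes. Text longer than the label is
 * truncated; when centering leaves an odd gap, the extra blank goes
 * on the right (an arbitrary choice for this sketch). */
void justify_label(const char *text, int fmt, char out[LABEL_WIDTH + 1])
{
    size_t len = strlen(text);
    size_t pad, left;

    if (len > LABEL_WIDTH)
        len = LABEL_WIDTH;
    pad = LABEL_WIDTH - len;
    left = (fmt == 2) ? pad : (fmt == 1) ? pad / 2 : 0;

    memset(out, ' ', LABEL_WIDTH);
    memcpy(out + left, text, len);
    out[LABEL_WIDTH] = '\0';
}
```

In a real program you would simply pass the label number, the string, and
the justification code to slk_set() and let curses place the text.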

The routine slk_noutrefresh() is comparable to wnoutrefresh() in
that it copies the label information onto the internal screen image, but it
does not cause the screen to be updated. Since a wrefresh() commonly follows,
slk_noutrefresh() is the function that is most commonly used to output
the labels.

Just as wrefresh() is equivalent to a wnoutrefresh() followed by a
doupdate(), so too the function slk_refresh() is equivalent to a
slk_noutrefresh() followed by a doupdate().

To prevent the soft labels from getting in the way of a shell escape, 
slk_^clear () may be called before doing the endwin (). This clears the soft 
labels off the screen and does a doupdate (). The function 
slk_restore 0 may be used to restore them to the screen. Seethe 
curses(3V) manual page for more information about the routines for using soft 
labels. 
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A curses program can produce output on more than one terminal at the same 
time. This is useful for single process programs that access a common database, 
such as multi-player games. 

Writing programs that output to multiple terminals is a difficult business, and the 
curses library does not solve all the problems you might encounter. For 
instance, the programs—^not the library routines—^must determine the filename 
and terminal-type of each terminal. The standard method, checking TERM in the 
environment, does not work, because each process can only examine its own 
environment. 

Another problem you might face is that of multiple programs reading from one 
tty line. This situation produces a race condition and should be avoided. How¬ 
ever, a program trying to take over another terminal cannot just shut off whatever 
program is currently running on its line. (Usually, security reasons would also 
make this inappropriate. But, for some applications, such as an inter-terminal 
communication program, or a program that takes over unused terminal lines, it 
would be appropriate.) A typical solution to this problem requires each user 
logged in on a line to run a program that notifies a master program that the user is 
interested in joining the master program and tells it the notification program’s 
process ID, the name of the tty line, and the type of terminal being used. Then 
the program goes to sleep until the master program finishes. When done, the 
master program wakes up the notification program and all programs exit. 

A curses program handles multiple terminals by always having a current ter¬ 
minal. All fonction calls always affect the current terminal. The master program 
should set up each terminal, saving a reference to the terminals in its own vari¬ 
ables. When it wishes to affect a terminal, it should set the current terminal as 
desired, and then call ordinary curses routines. 

References to terminals in a curses program have the type SCREEN*. A new 
terminal is initialized by calling newterm ( type, outfd, infd) . newterm () 
returns a screen reference to the terminal being set up. type is a character string, 
naming the kind of terminal being used, outfd is a stdio(3V) file pointer 
(FILE*) used for output to the terminal and infd a file pointer for input from the 
terminal. This call replaces the normal call to initscr (), which calls 
newterm(getenv(''TERM'') , stdout, stdin). 

To change the current terminal, call set_term {sp) where sp is the screen refer¬ 
ence to be made current. set_t erm () returns a reference to the previous ter¬ 
minal. 

It is important to realize that each terminal has its own set of windows and
options. Each terminal must be initialized separately with newterm().

Options such as cbreak() and noecho() must be set separately for each
terminal. The functions endwin() and refresh() must be called separately
for each terminal. The figure below shows a typical scenario to output a message
to several terminals.


Working with More than One Terminal

Figure 13-6    Sending a Message to Several Terminals

for (i = 0; i < nterm; i++)
{
    set_term(terms[i]);
    mvaddstr(0, 0, "Important message");
    refresh();
}

See the two program under curses Program Examples for a more complete
example.


13.3. Working with terminfo Routines

Some programs need to use lower-level routines than those offered by the
curses routines. For such programs, the terminfo routines are offered.
They do not manage your terminal screen, but rather, give you access to strings
and capabilities which you can use yourself to manipulate the terminal.

terminfo routines should not be used directly, except in the circumstances
noted below; the equivalent curses routines protect your program from the
idiosyncrasies of physical terminals. When you use the terminfo routines,
you must deal with them yourself.

Also, these low-level routines may change, rendering programs that rely on
them obsolete.

There are three circumstances when it is proper to use terminfo routines
directly. The first is when you need only some screen management capabilities,
for example, making text standout on a screen. The second is when writing a
filter. A typical filter does one transformation on an input stream without
clearing the screen or addressing the cursor. If this transformation is terminal
dependent and clearing the screen is inappropriate, use of the terminfo routines
is worthwhile. The third is when you are writing a special-purpose tool that
sends a special string to the terminal, such as programming a function key,
setting tab stops, sending output to a printer port, or dealing with the status
line.

Otherwise, you are discouraged from using these routines: the higher level
curses routines make your program more portable to other SunOS systems,
and to a wider class of terminals.


What Every terminfo Program Needs

A terminfo program typically includes the header files and routines shown
below:

Figure 13-7    Typical Framework of a terminfo Program

#include <curses.h>
#include <term.h>

setupterm((char*)0, 1, (int*)0);

putp(clear_screen);

reset_shell_mode();
exit(0);

The header files <curses.h> and <term.h> are required because they contain
the definitions of the strings, numbers, and flags used by the terminfo
routines. setupterm() takes care of initialization. Passing this routine the
values (char*)0, 1, and (int*)0 invokes reasonable defaults. If setupterm()
can’t figure out what kind of terminal you are on, it prints an error message
and exits. reset_shell_mode() performs functions similar to endwin() and
should be called before a terminfo program exits.

A global variable like clear_screen is defined by the call to setupterm().
It can be output using the terminfo routines putp() or tputs(), the latter
of which gives the programmer more control. This string should not be output
directly to the terminal using the C library routine printf(3V), because it
contains padding information. A program that directly outputs strings will
fail on terminals that require padding or that use the xon/xoff flow control
protocol.

At the terminfo level, the higher level routines like addch() and getch()
are not available. It is up to you to output whatever is needed. For a list of
capabilities and a description of what they do, see terminfo(5V); see
curses(3V) for a list of all the terminfo routines.

Compiling and Running a terminfo Program

The general command line for compiling, and the guidelines for running, a
program with terminfo routines are the same as those for compiling any other
curses program.


An Example terminfo Program

The example program, termhl, shows a simple use of terminfo routines. It
is a version of the highlight program (see curses Program Examples) that
does not use the higher level curses routines. termhl can be used as a filter.
It includes the strings to enter bold and underline mode and to turn off all
attributes.


/*
 * A terminfo level version of the highlight program.
 */

#include <curses.h>
#include <term.h>

int ulmode = 0;         /* Currently underlining */

main(argc, argv)
int argc;
char **argv;
{
    FILE *fd;
    int c, c2;
    int outch();

    if (argc > 2)
    {
        fprintf(stderr, "Usage: termhl [file]\n");
        exit(1);
    }

    if (argc == 2)
    {
        fd = fopen(argv[1], "r");
        if (fd == NULL)
        {
            perror(argv[1]);
            exit(2);
        }
    }
    else
    {
        fd = stdin;
    }

    setupterm((char*)0, 1, (int*)0);

    for (;;)
    {
        c = getc(fd);
        if (c == EOF)
            break;
        if (c == '\\')
        {
            c2 = getc(fd);
            switch (c2)
            {
            case 'B':
                tputs(enter_bold_mode, 1, outch);
                continue;
            case 'U':
                tputs(enter_underline_mode, 1, outch);
                ulmode = 1;
                continue;
            case 'N':
                tputs(exit_attribute_mode, 1, outch);
                ulmode = 0;
                continue;
            }
            putch(c);
            putch(c2);
        }
        else
            putch(c);
    }
    fclose(fd);
    fflush(stdout);
    resetterm();
    exit(0);
}

/*
 * This function is like putchar, but it checks for underlining.
 */
putch(c)
int c;
{
    outch(c);
    if (ulmode && underline_char)
    {
        outch('\b');
        tputs(underline_char, 1, outch);
    }
}

/*
 * Outchar is a function version of putchar that can be passed to
 * tputs as a routine to call.
 */
outch(c)
int c;
{
    putchar(c);
}


Let’s discuss the use of the function tputs(cap, affcnt, outc) in this program
to gain some insight into the terminfo routines. tputs() applies padding
information. Some terminals have the capability to delay output. Their terminal
descriptions in the terminfo database probably contain strings like $<20>,
which means to pad for 20 milliseconds (see the following section, Specifying
Capabilities). tputs() generates enough pad characters to delay for the
appropriate time.

tputs() has three parameters. The first parameter is the string capability to
be output.

The second is the number of lines affected by the capability. Some capabilities
may require padding that depends on the number of lines affected. For example,
insert_line may have to copy all lines below the current line, and may
require time proportional to the number of lines copied. By convention affcnt is
1 if no lines are affected. The value 1 is used, rather than 0, for safety, since
affcnt is multiplied by the amount of time per item, and anything multiplied by 0
is 0.

The third parameter is a routine to be called with each character. 
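
The padding arithmetic described above can be sketched in a few lines of C.
This is only an illustration, not the library's implementation: the function
names pad_delay_tenths and pad_chars are hypothetical, and the sketch assumes
10 bits on the line per character (one start bit, eight data bits, one stop
bit), which is common but not universal.

```c
/*
 * Total delay for one padding specification, in tenths of a millisecond.
 * delay_tenths is the number inside $<...> times ten (so $<20> is 200 and
 * $<2.5> is 25); per_line is nonzero when the delay is proportional to the
 * number of lines affected.
 */
long pad_delay_tenths(long delay_tenths, int per_line, int affcnt)
{
    /* An affcnt of 0 would cancel a proportional delay entirely. */
    return per_line ? delay_tenths * affcnt : delay_tenths;
}

/*
 * Pad characters needed to cover total_tenths at the given line speed,
 * rounded up.  Assumption: 10 bits on the wire per character.
 */
long pad_chars(long total_tenths, long baud)
{
    /* chars per second = baud / 10; seconds = total_tenths / 10000 */
    return (baud * total_tenths + 99999L) / 100000L;
}
```

For instance, pad_chars(pad_delay_tenths(200, 0, 1), 9600) asks how many pad
characters cover a fixed 20-millisecond delay on a 9600-baud line under these
assumptions.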

For many simple programs, affcnt is always 1 and outc always calls putchar.
For these programs, the routine putp(cap) is a convenient abbreviation.
termhl could be simplified by using putp().

Now to understand why you should use the curses level routines instead of
terminfo level routines whenever possible, note the special check for the
underline_char capability in this sample program. Some terminals, rather
than having a code to start underlining and a code to stop underlining, have a
code to underline the current character. termhl keeps track of the current
mode, and if the current character is supposed to be underlined, outputs
underline_char, if necessary. Low level details such as this are precisely
why the curses level is recommended over the terminfo level. curses takes
care of terminals with different methods of underlining and other terminal
functions. Programs at the terminfo level must handle such details themselves.
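
To make the per-character underlining case concrete, here is a small
stand-alone model of what putch() in termhl does, writing into a buffer
instead of the terminal. The function name put_underlined and the buffer
interface are hypothetical, and the sketch ignores padding, which the real
program leaves to tputs().

```c
#include <stddef.h>
#include <string.h>

/*
 * Append character c to out at offset len.  When underlining is on and
 * the terminal underlines one character at a time (ul_char is not NULL),
 * follow it with a backspace and the underline string.  ul_char is NULL
 * for terminals that use start/stop underline codes instead.
 * The caller must make sure out is large enough.
 * Returns the new length of out.
 */
size_t put_underlined(char *out, size_t len, int c, int ulmode,
                      const char *ul_char)
{
    out[len++] = (char)c;
    if (ulmode && ul_char != NULL) {
        out[len++] = '\b';
        strcpy(out + len, ul_char);
        len += strlen(ul_char);
    }
    out[len] = '\0';
    return len;
}
```

On a terminal whose underline_char is an underscore, each underlined
character becomes three bytes on the line: the character, a backspace, and
the underscore.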

termhl was written to illustrate a typical use of the terminfo routines. It is
more complex than it need be in order to illustrate some properties of
terminfo programs. The routine vidattr() (see curses(3V)) could have been
used instead of directly outputting enter_bold_mode,
enter_underline_mode, and exit_attribute_mode. In fact, the program
would be more robust if it did, since there are several ways to change video
attribute modes.

13.4. Working with the terminfo Database

The terminfo database describes the many terminals with which curses
programs, as well as some SunOS system tools, like vi(1), can be used. Each
terminal description is a compiled file containing the names that the terminal
is known by and a group of comma-separated fields describing the actions and
capabilities of the terminal. This section describes the terminfo database,
related support tools, and their relationship to the curses library.


Writing Terminal Descriptions

Descriptions of many popular terminals are already provided in the terminfo
database. However, it is possible that you’ll want to run a curses program on a
terminal for which there is no existing description. In this case, you’ll have to
build the description.

The general procedure for building a terminal description is as follows: 

1. Give the known names of the terminal. 

2. Learn about, list, and define the known capabilities. 

3. Compile the newly-created description entry. 

4. Test the entry for correct operation. 

5. Go back to step 2, add more capabilities, and repeat, as necessary. 

Building a terminal description is sometimes easier when you build small parts 
of the description and test them as you go along. These tests can expose 
deficiencies in the ability to describe the terminal. Also, modifying an existing 
description of a similar terminal can make the building task easier. 


Naming the Terminal

The name of a terminal is the first information given in a terminfo terminal
description. This string of names, assuming there is more than one name, is
separated by vertical bars (|). The first name given should be the most common
abbreviation for the terminal. The last name given is typically a verbose entry
that fully identifies the terminal by make and model. The long or “verbose”
name is typically the manufacturer’s formal name for the terminal. Names
between the first and last entries are known synonyms for the terminal name. All
but the verbose name should be typed in lowercase letters and contain no blanks.
Naturally, the formal name is entered as closely as possible to the manufacturer’s
name.

Here is the name string from the description for a virtual terminal.

virtual|VIRTUAL|cbunix|cb-unix|cb-unix virtual terminal,

Notice that the first name is the most commonly used abbreviation and the last is
the long name. Also notice the comma at the end of the name string.

Here’s the name string for a fictitious terminal, myterm:

myterm|mytm|mine|fancy|terminal|My FANCY Terminal,

Terminal names should follow common naming conventions. These conventions
start with a root name, like virtual or myterm, for example. Possible
hardware modes or user preferences should be shown by adding a hyphen and a
’mode indicator’ at the end of the name. For example, the ’wide mode’ (which is
shown by a -w) version of our fictitious terminal would be described as
myterm-w. terminfo(5V) describes mode indicators in greater detail.
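
The layout of a name string is simple enough to take apart by hand. The
following sketch shows one way to pull individual names out of it; the
function names term_name and term_name_count are hypothetical and are not
part of the terminfo library, which does this work for you.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the idx-th name (counting from 0) out of a terminfo name string
 * such as "myterm|mytm|mine|fancy|terminal|My FANCY Terminal," into out.
 * Names are separated by '|'; the comma ends the name string.
 * Returns 1 on success, 0 if there is no such name or out is too small.
 */
int term_name(const char *names, int idx, char *out, size_t outsz)
{
    const char *start = names;
    const char *end;
    size_t len;

    for (;;) {
        end = start + strcspn(start, "|,");
        if (idx == 0)
            break;
        if (*end != '|')
            return 0;               /* ran out of names */
        start = end + 1;
        idx--;
    }
    len = (size_t)(end - start);
    if (len == 0 || len >= outsz)
        return 0;
    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}

/* Number of names in the name string. */
int term_name_count(const char *names)
{
    int n = 1;

    for (; *names != '\0' && *names != ','; names++)
        if (*names == '|')
            n++;
    return n;
}
```

With the myterm string above, name 0 is the common abbreviation and name
term_name_count() - 1 is the verbose name.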

Learning About the Capabilities

After you complete the string of terminal names for your description, you have to
learn about the terminal’s capabilities so that you can properly describe them. To
learn about the capabilities your terminal has, you should do the following:

See the owner’s manual for your terminal. It should have information about the
capabilities available and the character strings that make up the sequence
transmitted from the keyboard for each capability.

Test the keys on your terminal to see what they transmit, if this information is
not available in the manual. You can test the keys as follows. Type:

stty -echo; cat -vu

followed by the keys you want to test. To return to the shell and restore echo,
type:

^D

stty echo

Note that stty echo is not displayed on the terminal screen as you type it,
because echoing is still turned off.


Specifying Capabilities

Once you know the capabilities of your terminal, you have to provide them in
your terminal description. Capability entries consist of a list of comma-separated
fields containing the abbreviated terminfo name and, in some cases, the
terminal’s value for each capability. For example, bel is the abbreviated name
for the beeping or ringing capability. On most terminals, a CTRL-G is the
instruction that produces a beeping sound. Therefore, the beeping capability
would be shown in the terminal description as bel=^G,.

The list of capabilities may continue across input lines as long as the
continuation lines start with a white-space character, or consist of a comment.
Comments can be included within the description by putting a # at the beginning
of the line.


For a curses program to run on any given terminal, its description in the
terminfo database must include, at least, the capabilities to move a cursor
in all four directions and to clear the screen.

The terminfo(5V) manual page has a complete list of the capabilities you can
use in a terminal description.

A terminal’s character sequence (value) for a capability can be a keyed operation
(like CTRL-G), a numeric value, or a parameter string containing the sequence
of operations required to achieve the particular capability. In a terminal
description, certain characters are used after the capability name to show what
type of character sequence is required. Explanations of these characters are
given below.

#     This shows that a numeric value is to follow. This character follows a
      capability that needs a number as a value. For example, the number of
      columns is defined as cols#80,.

=     This shows that the capability value is the character string that follows.
      This string instructs the terminal how to act and may actually be a
      sequence of commands. There are certain characters used in the
      instruction strings that have special meanings. These special characters
      follow:

      ^       This shows a control character is to be used. For example, the
              beeping sound is produced by a CTRL-G. This would be shown
              as ^G.

      \E \e   These characters followed by another character show an escape
              instruction. An entry of \EC would transmit to the terminal as
              ESCAPE-C.

      \n      These characters provide a NEWLINE character sequence.

      \l      These characters provide a LINEFEED character sequence.

      \r      These characters provide a RETURN character sequence.

      \t      These characters provide a TAB character sequence.

      \b      These characters provide a BACKSPACE character sequence.

      \f      These characters provide a FORMFEED character sequence.

      \s      These characters provide a SPACE character sequence.

      \nnn    This is a character whose octal value is nnn (nnn can be from
              one to three digits).

      $<n>    These symbols are used to show a delay in milliseconds. The
              desired length of delay is enclosed inside the brackets. The
              amount of delay may be a whole number, a numeric value to one
              decimal place (tenths), or either form followed by an
              asterisk (*). The * shows that the delay is to be proportional
              to the number of lines affected by the operation. For example,
              a 20-millisecond delay per line would appear as $<20*>. See
              the terminfo(5V) manual page for more information about delays
              and padding.
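
As an illustration of the special characters listed above, the following
sketch turns a capability value as written in a source description into the
bytes it stands for. It is not how tic is implemented; the function name
decode_cap is hypothetical, and the sketch deliberately leaves $<...> delay
specifications and % sequences untouched, since those are interpreted later.

```c
#include <stddef.h>

/*
 * Decode ^X, \E, \e, \n, \l, \r, \t, \b, \f, \s and \nnn (octal) from
 * src into out.  Everything else, including $<...> and % sequences, is
 * copied through unchanged.
 * Returns the number of bytes stored (a terminating NUL is added but
 * not counted), or -1 if out is too small.
 */
int decode_cap(const char *src, char *out, size_t outsz)
{
    size_t n = 0;

    while (*src != '\0') {
        int c;

        if (*src == '^' && src[1] != '\0') {
            /* ^? is DEL; other control characters mask to 5 bits */
            c = (src[1] == '?') ? 0177 : (src[1] & 037);
            src += 2;
        } else if (*src == '\\' && src[1] != '\0') {
            src++;
            if (*src >= '0' && *src <= '7') {
                int digits = 0;

                c = 0;
                while (digits < 3 && *src >= '0' && *src <= '7') {
                    c = c * 8 + (*src - '0');
                    src++;
                    digits++;
                }
            } else {
                switch (*src) {
                case 'E': case 'e': c = 033;  break;
                case 'n': c = '\n'; break;
                case 'l': c = '\n'; break;      /* linefeed */
                case 'r': c = '\r'; break;
                case 't': c = '\t'; break;
                case 'b': c = '\b'; break;
                case 'f': c = '\f'; break;
                case 's': c = ' ';  break;
                default:  c = *src; break;      /* taken literally */
                }
                src++;
            }
        } else {
            c = *src++;
        }
        if (n + 1 >= outsz)
            return -1;
        out[n++] = (char)c;
    }
    if (outsz == 0)
        return -1;
    out[n] = '\0';
    return (int)n;
}
```

Decoding the value ^G this way yields the single byte 007, and \EK$<3>
yields ESCAPE, K, and the four characters of the delay specification.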

Sometimes, it may be necessary to comment out a capability so that the terminal 
ignores this particular field. This is done by placing a period (.) in front of the 
abbreviated name for the capability. For example, if you would like to comment
out the beeping capability, the description entry would appear as

.bel=^G,

With this background information about specifying capabilities, let’s add the
capability string to our description of myterm. We’ll consider basic
capabilities, screen-oriented capabilities, keyboard-entered capabilities, and
parameter string capabilities.


Basic Capabilities

Some capabilities common to most terminals are bells, columns, lines on the
screen, and overstriking of characters, if necessary. Suppose our fictitious
terminal has these and a few other capabilities, as listed below. Note that the
list gives the abbreviated terminfo name for each capability in the parentheses
following the capability description:

□ An automatic wrap around to the beginning of the next line whenever the
  cursor reaches the right-hand margin (am).

□ The ability to produce a beeping sound. The instruction required to produce
  the beeping sound is ^G (bel).

□ An 80-column wide screen (cols).

□ A 30-line long screen (lines).

□ Use of xon/xoff protocol (xon).

By combining the name string with the capability descriptions that we now have,
we get the following general terminfo database entry:

myterm|mytm|mine|fancy|terminal|My FANCY Terminal,
        am, bel=^G, cols#80, lines#30, xon,


Screen-Oriented Capabilities

Screen-oriented capabilities manipulate the contents of a screen. Our example
terminal myterm has the following screen-oriented capabilities. Again, the
abbreviated command associated with the given capability is shown in
parentheses.

□ A RETURN is a CTRL-M (cr).

□ A cursor up one line motion is a CTRL-K (cuu1).

□ A cursor down one line motion is a CTRL-J (cud1).

□ Moving the cursor to the left one space is a CTRL-H (cub1).

□ Moving the cursor to the right one space is a CTRL-L (cuf1).

□ Entering reverse video mode is an ESCAPE-D (smso).

□ Exiting reverse video mode is an ESCAPE-Z (rmso).

□ A clear to the end of a line sequence is an ESCAPE-K and should have a
  3-millisecond delay (el).

□ A terminal scrolls when receiving a NEWLINE at the bottom of a page (ind).

The revised terminal description for myterm including these screen-oriented
capabilities follows:

myterm|mytm|mine|fancy|terminal|My FANCY Terminal,
        am, bel=^G, cols#80, lines#30, xon,
        cr=^M, cuu1=^K, cud1=^J, cub1=^H, cuf1=^L,
        smso=\ED, rmso=\EZ, el=\EK$<3>, ind=\n,


Keyboard-Entered Capabilities

Keyboard-entered capabilities are sequences generated when a key is typed on a
terminal keyboard. Most terminals have, at least, a few special keys on their
keyboard, such as arrow keys and the backspace key. Our example terminal has
several of these keys, whose sequences are as follows:

□ The backspace key generates a CTRL-H (kbs).

□ The up arrow key generates an ESCAPE-[ A (kcuu1).

□ The down arrow key generates an ESCAPE-[ B (kcud1).

□ The right arrow key generates an ESCAPE-[ C (kcuf1).

□ The left arrow key generates an ESCAPE-[ D (kcub1).

□ The home key generates an ESCAPE-[ H (khome).

Adding this new information to our database entry for myterm produces:

myterm|mytm|mine|fancy|terminal|My FANCY Terminal,
        am, bel=^G, cols#80, lines#30, xon,
        cr=^M, cuu1=^K, cud1=^J, cub1=^H, cuf1=^L,
        smso=\ED, rmso=\EZ, el=\EK$<3>, ind=\n,
        kbs=^H, kcuu1=\E[A, kcud1=\E[B, kcuf1=\E[C,
        kcub1=\E[D, khome=\E[H,


Parameter String Capabilities

Parameter string capabilities are capabilities that can take parameters, such as
those used to position a cursor on a screen, or to turn on a combination of video
modes. To address a cursor, the cup capability is used and is passed two
parameters: the row and column to address. String capabilities, such as cup and
the set attributes (sgr) capability, are passed arguments in a terminfo program
by the tparm() routine.

The arguments to string capabilities are manipulated with special % sequences
similar to those found in a call to printf(3V). In addition, many of the
features found on a simple stack-based RPN calculator are available. cup, as
noted above, takes two arguments: the row and column. sgr takes nine
arguments, one for each of the nine video attributes. See terminfo(5V) for the
list and order of the attributes and further examples of sgr.

Our fancy terminal’s cursor position sequence requires a row and column to be
output as numbers separated by a semicolon, preceded by ESCAPE-[ and followed
by H. The coordinate numbers are 1-based rather than 0-based. Thus, to
move to row 5, column 18, counting from (0,0), the sequence ESCAPE-[6;19H
would be output.

Integer arguments are pushed onto the stack with a %p sequence followed by the
argument number, such as %p2 to push the second argument. A shorthand
sequence to increment the first two arguments is %i. To output the top number
on the stack as a decimal, a %d sequence is used, exactly as in printf.
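
The stack behavior of %p, %i, and %d can be modelled in a few lines. The
sketch below handles only those sequences (plus %% for a literal percent
sign) and only two arguments; it is a teaching aid under those assumptions,
not a replacement for tparm(). The function name expand_cap is hypothetical,
and the string passed to it must already have its \E-style escapes decoded.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Expand a parameter string that uses only %i, %p1, %p2, %d and %%.
 * Returns the number of characters stored in out, or -1 on an
 * unsupported sequence, an empty or full stack, or lack of room.
 */
int expand_cap(const char *fmt, int p1, int p2, char *out, size_t outsz)
{
    int param[2];
    int stack[8];
    int sp = 0;
    size_t n = 0;

    param[0] = p1;
    param[1] = p2;
    if (outsz == 0)
        return -1;

    while (*fmt != '\0') {
        if (*fmt != '%') {                  /* ordinary character */
            if (n + 1 >= outsz)
                return -1;
            out[n++] = *fmt++;
            continue;
        }
        fmt++;                              /* skip the '%' */
        switch (*fmt) {
        case '%':
            if (n + 1 >= outsz)
                return -1;
            out[n++] = '%';
            break;
        case 'i':                           /* make both arguments 1-based */
            param[0]++;
            param[1]++;
            break;
        case 'p':                           /* push argument 1 or 2 */
            fmt++;
            if ((*fmt != '1' && *fmt != '2') || sp >= 8)
                return -1;
            stack[sp++] = param[*fmt - '1'];
            break;
        case 'd': {                         /* pop and output as decimal */
            int len;

            if (sp == 0)
                return -1;
            len = snprintf(out + n, outsz - n, "%d", stack[--sp]);
            if (len < 0 || (size_t)len >= outsz - n)
                return -1;
            n += (size_t)len;
            break;
        }
        default:                            /* anything else is unsupported */
            return -1;
        }
        fmt++;
    }
    out[n] = '\0';
    return (int)n;
}
```

Given a string that meets the requirements stated above for the example
terminal, ESCAPE-[ followed by %i%p1%d;%p2%dH, and the arguments 5 and 18,
the sketch produces ESCAPE-[6;19H.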
Our terminal’s cup sequence is built up as follows:

cup=    Meaning

\E[     output ESCAPE-[
%i      increment the two arguments
%p1     push the 1st argument (the row) onto the stack
%d      output the row as a decimal
;       output a semi-colon
%p2     push the 2nd argument (the column) onto the stack
%d      output the column as a decimal
H       output the trailing letter

or

cup=\E[%i%p1%d;%p2%dH,

Adding this new information to our database entry for myterm produces:

myterm|mytm|mine|fancy|terminal|My FANCY Terminal,
        am, bel=^G, cols#80, lines#30, xon,
        cr=^M, cuu1=^K, cud1=^J, cub1=^H, cuf1=^L,
        smso=\ED, rmso=\EZ, el=\EK$<3>, ind=\n,
        kbs=^H, kcuu1=\E[A, kcud1=\E[B, kcuf1=\E[C,
        kcub1=\E[D, khome=\E[H,
        cup=\E[%i%p1%d;%p2%dH,

See terminfo(5V) for more information about parameter string capabilities.


Compiling the Description

The terminfo database entries are compiled using tic, the terminfo compiler
command. This compiler translates terminfo source entries into the
compiled format used by the terminfo and curses routines.

The source file for a description is usually suffixed with .ti. For example, the
description of myterm would be in a source file named myterm.ti. The compiled
description of myterm would usually be placed in
/usr/share/lib/terminfo/m/myterm, since the first letter in the
description entry is m. Links would also be made to synonyms of myterm, for
example, to /f/fancy. If the environment variable TERMINFO were set to a
directory and exported before the entry was compiled, the compiled entry would
be placed in the TERMINFO directory. All programs using the entry would then
look in the new directory for the description file if TERMINFO were set, before
looking in the default /usr/share/lib/terminfo. The general format for
the tic command is:

tic [-v] [-c] source_file
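
The placement rule described above (a directory, then a subdirectory named
for the first letter of the entry, then the entry name) can be sketched in C.
The function name entry_path is hypothetical, and the sketch only builds the
path name; it does not check that the file exists or fall back from one
directory to the other.

```c
#include <stddef.h>
#include <stdio.h>

#define DEFAULT_TERMINFO "/usr/share/lib/terminfo"

/*
 * Build the path name of a compiled entry.  dir is the value of the
 * TERMINFO environment variable, or NULL (or empty) to use the default
 * directory.  Returns 0 on success, -1 if name is empty or out is too
 * small.
 */
int entry_path(const char *dir, const char *name, char *out, size_t outsz)
{
    int len;

    if (dir == NULL || *dir == '\0')
        dir = DEFAULT_TERMINFO;
    if (name == NULL || *name == '\0')
        return -1;
    len = snprintf(out, outsz, "%s/%c/%s", dir, name[0], name);
    return (len < 0 || (size_t)len >= outsz) ? -1 : 0;
}
```

A caller would typically pass getenv("TERMINFO") as dir, so that an exported
TERMINFO directory takes precedence over the default.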
With the -v (verbose) option, the compiler traces its actions and prints messages
regarding its progress. The -c option checks for errors. tic(8V) compiles only
one file at a time. The following command line shows how to compile the
terminfo source file for myterm.

tic -v myterm.ti

Refer to tic(8V) for more information.


Testing the Description

Let’s consider ways to test a terminal description. First, you can test it by
setting the environment variable TERMINFO to the path name of the directory
containing the description. If programs run the same on the new terminal as they
did on the older known terminals, then the new description is functional.

Or, you can use the tput(1V) command. This command outputs a string or an
integer according to the type of capability being described. If the capability is a
Boolean expression, then tput sets the exit code (0 for TRUE, 1 for FALSE) and
produces no output. The general format for the tput command is as follows:

tput [-Ttype] capname

The type of terminal you are requesting information about is identified with the
-Ttype option. Usually, this option is not necessary because the default terminal
name is taken from the environment variable TERM. The capname field is used
to show what capability to output from the terminfo database.

The following command line shows how to output the "clear screen" character
sequence for the terminal being used:

tput clear

The following command line shows how to output the number of columns for the
terminal being used:

tput cols

tput(1V) contains more information on the usage and possible messages
associated with this command.


Comparing or Printing terminfo Descriptions

Sometimes you may want to compare two terminal descriptions or quickly look at
a description without going to the terminfo source directory. The
infocmp(8V) command was designed to help you with both of these tasks.
To compare two descriptions of the same terminal, for example,

mkdir /tmp/old /tmp/new
TERMINFO=/tmp/old tic oldvirtual.ti
TERMINFO=/tmp/new tic newvirtual.ti
infocmp -A /tmp/old -B /tmp/new -d virtual virtual

compares the old and new virtual entries.

To print out the terminfo source for the virtual terminal, type:

infocmp -I virtual


Converting a termcap Description to a terminfo Description

The terminfo database is an alternative to the termcap database. Because
of the many programs and processes that have been written with and for the
termcap database, it is not feasible to do a complete conversion from
termcap to terminfo. Since converting between the two requires experience
with both, all entries into the databases should be handled with extreme caution.
These files are important to the operation of your terminal.

The captoinfo(8V) command converts termcap(5) descriptions to
terminfo(5V) descriptions. When a file is passed to captoinfo, it looks for
termcap descriptions and writes the equivalent terminfo descriptions on the
standard output. For example,

captoinfo /etc/termcap

converts the file /etc/termcap to terminfo source, preserving comments
and other extraneous information within the file. The command line

captoinfo

looks up the current terminal in the termcap database, as specified by the
TERM and TERMCAP environment variables, and converts it to terminfo.

To convert a terminfo description into a termcap entry, use infocmp -C.

If you have been using cursor optimization programs with the -ltermcap or
-ltermlib option in the /usr/5bin/cc command line, those programs
should still be functional.

13.5. curses Program Examples

The following examples demonstrate uses of curses routines.


The editor Program

This program illustrates how to use curses routines to write a screen editor.
For simplicity, editor keeps the buffer in stdscr; obviously, a real screen
editor would have a separate data structure for the buffer. This program has
many other simplifications: no provision is made for files of any length other
than the size of the screen, for lines longer than the width of the screen, or for
control characters in the file.

Several points about this program are worth making. First, it uses the move(),
mvaddstr(), flash(), wnoutrefresh() and clrtoeol() routines.
These routines are all discussed in this chapter under Working with curses
Routines.

Second, it also uses some curses routines that we have not discussed. For
example, the function to write out a file uses the mvinch() routine, which
returns a character in a window at a given position. The data structure used to
write out a file does not keep track of the number of characters in a line or the
number of lines in the file, so trailing blanks are eliminated when the file is
written. The program also uses the insch(), delch(), insertln(), and
deleteln() routines. These functions insert and delete a character or line.

See curses(3V) for more information about these routines.
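
The way trailing blanks are dropped can be seen in isolation. The sketch
below applies the same backward scan that the editor's len() routine uses,
but to an ordinary character array rather than to stdscr, so that it can be
tried without a terminal. The name trimmed_len and the array interface are
hypothetical.

```c
/*
 * Length of a screen row once trailing blanks are dropped.  rowbuf holds
 * exactly cols characters, blank-filled on the right, as a row of the
 * screen would be.
 */
int trimmed_len(const char *rowbuf, int cols)
{
    int linelen = cols - 1;

    /* Walk back from the right margin over blanks. */
    while (linelen >= 0 && rowbuf[linelen] == ' ')
        linelen--;
    return linelen + 1;
}
```

Blanks inside the line are kept; only those between the last visible
character and the right margin are lost, and an all-blank row has length 0.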

Third, the editor command interpreter accepts special keys, as well as ASCII
characters. On one hand, new users find an editor that handles special keys easier
to learn about. For example, it’s easier for new users to use the arrow keys to
move a cursor than it is to memorize that the letter h means left, j means down, k
means up, and l means right. On the other hand, experienced users usually like
having the ASCII characters to avoid moving their hands from the home row
position to use special keys.


Since not all terminals have arrow 
keys, your curses programs will 
work with more terminals if there is an 
ASCII character associated with each 
special key. 
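
One way to give every special key an ASCII equivalent is to map both to the
same action in a single switch. The sketch below shows the idea without
curses; the MY_KEY_ constants are hypothetical stand-ins for the KEY_ values
that <curses.h> defines, and the action names are invented for the example.

```c
/* Hypothetical stand-ins for the KEY_ values supplied by <curses.h>. */
enum {
    MY_KEY_DOWN  = 0402,
    MY_KEY_UP    = 0403,
    MY_KEY_LEFT  = 0404,
    MY_KEY_RIGHT = 0405
};

enum action { ACT_NONE, ACT_LEFT, ACT_DOWN, ACT_UP, ACT_RIGHT };

/* Map an input code, special key or plain ASCII, to a cursor motion. */
enum action key_action(int c)
{
    switch (c) {
    case 'h': case MY_KEY_LEFT:  return ACT_LEFT;
    case 'j': case MY_KEY_DOWN:  return ACT_DOWN;
    case 'k': case MY_KEY_UP:    return ACT_UP;
    case 'l': case MY_KEY_RIGHT: return ACT_RIGHT;
    default:                     return ACT_NONE;
    }
}
```

A terminal without arrow keys never sends the special codes, but the letters
still reach the same cases.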


Fourth, the CTRL-L command illustrates a feature most programs using
curses routines should have. Often some program beyond the control of the
routines writes something to the screen (for instance, a broadcast message) or
some line noise affects the screen so much that the routines cannot keep track of
it. A user invoking editor can type CTRL-L, causing the screen to be cleared
and redrawn with a call to wrefresh(curscr).

Finally, another important point is that the input command is terminated by
CTRL-D, not the ESCAPE key. It is very tempting to use ESCAPE as a
command, since it is one of the few special keys available on all keyboards.
(RETURN and BREAK are the only others.) However, using escape as a
separate key introduces an ambiguity. Most terminals use sequences of
characters beginning with escape (i.e., escape sequences) to control the
terminal, and have special keys that send escape sequences to the computer. If a
computer receives an escape from a terminal, it cannot tell whether the user
pressed the ESCAPE key or whether a special key was pressed.

editor and other curses programs handle the ambiguity by setting a timer.
If another character is received during this time, and if that character might be
the beginning of a special key, the program reads more input until either a full
special key is read, the time out is reached, or a character is received that could
not have been generated by a special key. While this strategy works most of the
time, it is not foolproof. It is possible for the user to press ESCAPE, then to
type another key quickly, which causes the curses program to think a special
key has been pressed. Also, a pause occurs until the escape can be passed to the
user program, resulting in a slower response to the ESCAPE key.
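
The timer strategy can be modelled without a terminal by recording how long
each byte took to arrive. The sketch below recognizes a single special key,
the up arrow of the example terminal (ESCAPE-[ A); the function name
decode_key, the MY_KEY_UP code, and the idea of passing arrival gaps in an
array are all assumptions made for illustration, not the way curses is
implemented.

```c
#include <string.h>

#define ESC 033

/* Hypothetical key code for the up arrow; <curses.h> has the real KEY_UP. */
#define MY_KEY_UP 0403

/* Sequence sent by the up arrow key on the example terminal. */
static const char up_seq[] = "\033[A";

/*
 * Decide what the input starting at bytes[0] means.  gap_ms[i] is the
 * time that passed before byte i arrived (gap_ms[0] is not used).
 * Stores in *used the number of bytes consumed and returns the key
 * code, or -1 if there is no input.
 */
int decode_key(const char *bytes, const int *gap_ms, int n,
               int timeout_ms, int *used)
{
    int seqlen = (int)strlen(up_seq);
    int i;

    if (n < 1) {
        *used = 0;
        return -1;
    }
    *used = 1;
    if ((unsigned char)bytes[0] != ESC)
        return (unsigned char)bytes[0];

    for (i = 1; i < seqlen; i++) {
        if (i >= n)                     /* nothing more arrived */
            return ESC;
        if (gap_ms[i] > timeout_ms)     /* too slow to be a special key */
            return ESC;
        if (bytes[i] != up_seq[i])      /* not part of the special key */
            return ESC;
    }
    *used = seqlen;
    return MY_KEY_UP;
}
```

The model also shows the weakness noted above: a user who presses ESCAPE and
then types [ and A within the time out is indistinguishable from the up arrow
key.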

Many existing programs use [ ESCAPE 1 as a fundamental command, which can¬ 
not be changed without infuriating a large class of users. These programs cannot 
make use of special keys without dealing with this ambiguity, and at best must 
resort to a time-out solution. The moral is clear: when designing your curses 
programs, avoid the I ESCAPE 1 key. 
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editor — a Sample Program Listing 


/* editor: A screen-oriented editor. The user
 * interface is similar to a subset of vi.
 * The buffer is kept in stdscr to simplify
 * the program.
 */

#include <stdio.h>
#include <curses.h>

#define CTRL(c) ((c) & 037)

main(argc, argv)
int argc;
char **argv;
{
    extern void perror(), exit();
    int i, n, l;
    int c;
    int line = 0;
    FILE *fd;

    if (argc != 2)
    {
        fprintf(stderr, "Usage: %s file\n", argv[0]);
        exit(1);
    }

    fd = fopen(argv[1], "r");
    if (fd == NULL)
    {
        perror(argv[1]);
        exit(2);
    }

    initscr();
    cbreak();
    nonl();
    noecho();
    idlok(stdscr, TRUE);
    keypad(stdscr, TRUE);

    /* Read in the file */
    while ((c = getc(fd)) != EOF)
    {
        if (c == '\n')
            line++;
        if (line > LINES - 2)
            break;
        addch(c);
    }
    fclose(fd);

    move(0, 0);
    refresh();
    edit();

    /* Write out the file */
    fd = fopen(argv[1], "w");
    for (l = 0; l < LINES - 1; l++)
    {
        n = len(l);
        for (i = 0; i < n; i++)
            putc(mvinch(l, i) & A_CHARTEXT, fd);
        putc('\n', fd);
    }
    fclose(fd);

    endwin();
    exit(0);
}

len(lineno)
int lineno;
{
    int linelen = COLS - 1;

    while (linelen >= 0 && mvinch(lineno, linelen) == ' ')
        linelen--;
    return linelen + 1;
}

/* Global value of current cursor position */
int row, col;

edit()
{
    int c;

    for (;;)
    {
        move(row, col);
        refresh();
        c = getch();

        /* Editor commands */
        switch (c)
        {
        /* hjkl and arrow keys: move cursor
         * in direction indicated */
        case 'h':
        case KEY_LEFT:
            if (col > 0)
                col--;
            else
                flash();
            break;

        case 'j':
        case KEY_DOWN:
            if (row < LINES - 1)
                row++;
            else
                flash();
            break;

        case 'k':
        case KEY_UP:
            if (row > 0)
                row--;
            else
                flash();
            break;

        case 'l':
        case KEY_RIGHT:
            if (col < COLS - 1)
                col++;
            else
                flash();
            break;

        /* i: enter input mode */
        case KEY_IC:
        case 'i':
            input();
            break;

        /* x: delete current character */
        case KEY_DC:
        case 'x':
            delch();
            break;

        /* o: open up a new line and enter input mode */
        case KEY_IL:
        case 'o':
            move(++row, col = 0);
            insertln();
            input();
            break;

        /* d: delete current line */
        case KEY_DL:
        case 'd':
            deleteln();
            break;

        /* ^L: redraw screen */
        case KEY_CLEAR:
        case CTRL('L'):
            wrefresh(curscr);
            break;

        /* w: write and quit */
        case 'w':
            return;

        /* q: quit without writing */
        case 'q':
            endwin();
            exit(2);

        default:
            flash();
            break;
        }
    }
}

/*
 * Insert mode: accept characters and insert them.
 * End with ^D or EIC
 */
input()
{
    int c;

    standout();
    mvaddstr(LINES - 1, COLS - 20, "INPUT MODE");
    standend();
    move(row, col);
    refresh();

    for (;;)
    {
        c = getch();
        if (c == CTRL('D') || c == KEY_EIC)
            break;
        insch(c);
        move(row, ++col);
        refresh();
    }

    move(LINES - 1, COLS - 20);
    clrtoeol();
    move(row, col);
    refresh();
}


The highlight Program

This program illustrates a use of the routine attrset(). highlight reads a
text file and uses embedded escape sequences to control attributes. \U turns on
underlining, \B turns on bold, and \N restores the default output attributes.

Note the first call to scrollok(), a routine that we have not previously
discussed (see curses(3V)). This routine allows the terminal to scroll if the file is
longer than one screen. When an attempt is made to draw past the bottom of the
screen, scrollok() automatically scrolls the terminal up a line and calls
refresh().



/*
 * highlight: a program to turn \U, \B, and
 * \N sequences into highlighted
 * output, allowing words to be
 * displayed underlined or in bold.
 */

#include <stdio.h>
#include <curses.h>

main(argc, argv)
int argc;
char **argv;
{
    FILE *fd;
    int c, c2;
    void exit(), perror();

    if (argc != 2)
    {
        fprintf(stderr, "Usage: highlight file\n");
        exit(1);
    }

    fd = fopen(argv[1], "r");
    if (fd == NULL)
    {
        perror(argv[1]);
        exit(2);
    }

    initscr();
    scrollok(stdscr, TRUE);
    nonl();

    while ((c = getc(fd)) != EOF)
    {
        if (c == '\\')
        {
            c2 = getc(fd);
            switch (c2)
            {
            case 'B':
                attrset(A_BOLD);
                continue;
            case 'U':
                attrset(A_UNDERLINE);
                continue;
            case 'N':
                attrset(0);
                continue;
            }
            addch(c);
            addch(c2);
        }
        else
            addch(c);
    }

    fclose(fd);
    refresh();
    endwin();
    exit(0);
}
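
The escape-sequence handling in highlight can be exercised without a terminal. The following is a minimal sketch, not part of the original program: scan_highlight and the attr enumeration are hypothetical names that stand in for the attrset() calls, and the caller is assumed to supply output buffers at least strlen(in) + 1 elements long.

```c
#include <assert.h>

enum attr { ATTR_NORMAL, ATTR_BOLD, ATTR_UNDERLINE };

/*
 * Copy `in` to `out`, dropping the \B, \U and \N sequences, and record
 * in `attrs` the attribute in force for each character written.
 * Returns the number of characters written (excluding the final NUL).
 */
int scan_highlight(const char *in, char *out, enum attr *attrs)
{
    enum attr cur = ATTR_NORMAL;
    int n = 0;

    assert(in != 0 && out != 0 && attrs != 0);
    while (*in != '\0') {
        char c = *in++;

        if (c == '\\' && *in != '\0') {
            char c2 = *in++;

            if (c2 == 'B') { cur = ATTR_BOLD; continue; }
            if (c2 == 'U') { cur = ATTR_UNDERLINE; continue; }
            if (c2 == 'N') { cur = ATTR_NORMAL; continue; }
            out[n] = c;             /* not an escape: keep the backslash */
            attrs[n++] = cur;
            c = c2;                 /* and fall through to keep c2 as well */
        }
        out[n] = c;
        attrs[n++] = cur;
    }
    out[n] = '\0';
    return n;
}
```

Given the input a\Bb\Nc, the sketch writes abc and marks only the b as bold, mirroring what highlight would display.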
The scatter Program

This program takes the first LINES - 1 lines of characters from the standard
input and displays the characters on a terminal screen in a random order. For this
program to work properly, the input file should not contain tabs or non-printing
characters.




/*
 * The scatter program.
 */

#include <curses.h>
#include <sys/types.h>

extern time_t time();

#define MAXLINES 120
#define MAXCOLS 160

char s[MAXLINES][MAXCOLS];  /* Screen Array */
int T[MAXLINES][MAXCOLS];   /* Tag Array - Keeps track of
                             * the number of characters
                             * printed and their positions. */

main()
{
    register int row = 0, col = 0;
    register int c;
    int char_count = 0;
    time_t t;
    void exit(), srand();

    initscr();
    for (row = 0; row < MAXLINES; row++)
        for (col = 0; col < MAXCOLS; col++)
            s[row][col] = ' ';

    col = row = 0;

    /* Read screen in */
    while ((c = getchar()) != EOF && row < LINES) {
        if (c != '\n')
        {
            /* Place char in screen array */
            s[row][col++] = c;
            if (c != ' ')
                char_count++;
        }
        else
        {
            col = 0;
            row++;
        }
    }

    time(&t);   /* Seed the random number generator */
    srand((unsigned)t);

    while (char_count)
    {
        row = rand() % LINES;
        col = (rand() >> 2) % COLS;
        if (T[row][col] != 1 && s[row][col] != ' ')
        {
            move(row, col);
            addch(s[row][col]);
            T[row][col] = 1;
            char_count--;
            refresh();
        }
    }

    endwin();
    exit(0);
}

The show Program 


show pages through a file, showing one screen of its contents each time you
depress the space bar. The program calls cbreak() so that you can depress the
space bar without having to hit return; it calls noecho() to prevent the space
from echoing on the screen. The nonl() routine, which we have not previously
discussed, is called to enable more cursor optimization. The idlok() routine,
which we also have not discussed, is called to allow insert and delete line. (See
curses(3V) for more information about these routines.) Also notice that
clrtoeol() and clrtobot() are called.

By creating an input file for show made up of screen-sized (about 24 lines)
pages, each varying slightly from the previous page, nearly any exercise for a
curses program can be created. This type of input file is called a show
script.


#include <curses.h>
#include <signal.h>

main(argc, argv)
int argc;
char *argv[];
{
    FILE *fd;
    char linebuf[BUFSIZ];
    int line;
    void done(), perror(), exit();

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        exit(1);
    }

    if ((fd = fopen(argv[1], "r")) == NULL)
    {
        perror(argv[1]);
        exit(2);
    }

    signal(SIGINT, done);

    initscr();
    noecho();
    cbreak();
    nonl();
The two Program

This program pages through a file, writing one page to the terminal from which
the program is invoked and the next page to the terminal named on the command
line. It then waits for a space to be typed on either terminal and writes the next
page to the terminal at which the space is typed.

two is just a simple example of a two-terminal curses program. It does not
handle notification; instead, it requires the name and type of the second terminal
on the command line. As written, the command "sleep 100000" must be
typed at the second terminal to put it to sleep while the program runs, and the
user of the first terminal must have both read and write permission on the second
terminal.

#include <curses.h>
#include <signal.h>

SCREEN *me, *you;
SCREEN *set_term();

FILE *fd, *fdyou;
char linebuf[512];

main(argc, argv)
int argc;
char **argv;
{
    void done(), exit();
    unsigned sleep();
    char *getenv();
    int c;

    if (argc != 4)
    {
        fprintf(stderr, "Usage: two othertty otherttytype inputfile\n");
        exit(1);
    }

    fd = fopen(argv[3], "r");
    fdyou = fopen(argv[1], "w+");

    signal(SIGINT, done);       /* die gracefully */

    me = newterm(getenv("TERM"), stdout, stdin);    /* initialize my tty */
    you = newterm(argv[2], fdyou, fdyou);   /* Initialize the other terminal */

    set_term(me);               /* Set modes for my terminal */
    noecho();                   /* turn off tty echo */
    cbreak();                   /* enter cbreak mode */
    nonl();                     /* Allow linefeed */
    nodelay(stdscr, TRUE);      /* No hang on input */

    set_term(you);              /* Set modes for other terminal */
    noecho();
    cbreak();
    nonl();
    nodelay(stdscr, TRUE);

    /* Dump first screen full on my terminal */
    dump_page(me);

    /* Dump second screen full on the other terminal */
    dump_page(you);

    for (;;)                    /* for each screen full */
    {
        set_term(me);
        c = getch();
        if (c == 'q')           /* wait for user to read it */
            done();
        if (c == ' ')
            dump_page(me);

        set_term(you);
        c = getch();
        if (c == 'q')           /* wait for user to read it */
            done();
        if (c == ' ')
            dump_page(you);

        sleep(1);
    }
}

dump_page(term)
SCREEN *term;
{
    int line;

    set_term(term);
    move(0, 0);
    for (line = 0; line < LINES - 1; line++) {
        if (fgets(linebuf, sizeof linebuf, fd) == NULL) {
            clrtobot();
            done();
        }
        mvaddstr(line, 0, linebuf);
    }
    standout();
    mvprintw(LINES - 1, 0, "--More--");
    standend();
    refresh();                  /* sync screen */
}

/*
 * Clean up and exit.
 */
void done()
{
    /* Clean up first terminal */
    set_term(you);
    move(LINES - 1, 0);         /* to lower left corner */
    clrtoeol();                 /* clear bottom line */
    refresh();                  /* flush out everything */
    endwin();                   /* curses cleanup */

    /* Clean up second terminal */
    set_term(me);
    move(LINES - 1, 0);         /* to lower left corner */
    clrtoeol();                 /* clear bottom line */
    refresh();                  /* flush out everything */
    endwin();                   /* curses cleanup */

    exit(0);
}






The window Program

This example program demonstrates the use of multiple windows. The main
display is kept in stdscr. When you want to put something other than what is
in stdscr on the physical terminal screen temporarily, a new window is created
covering part of the screen. A call to wrefresh() for that window causes it to
be written over the stdscr image on the terminal screen. Calling refresh()
on stdscr results in the original window being redrawn on the screen. Note
the calls to the touchwin() routine (which we have not discussed; see
curses(3V)) that occur before writing out a window over an existing window
on the terminal screen. This routine prevents screen optimization in a curses
program. If you have trouble refreshing a new window that overlaps an old
window, it may be necessary to call touchwin() for the new window to get it
completely written out.


#include <curses.h>

WINDOW *cmdwin;

main()
{
    int i, c;
    char buf[120];
    void exit();

    initscr();
    nonl();
    noecho();
    cbreak();

    cmdwin = newwin(3, COLS, 0, 0);     /* top 3 lines */
    for (i = 0; i < LINES; i++)
        mvprintw(i, 0, "This is line %d of stdscr", i);

    for (;;)
    {
        refresh();
        c = getch();
        switch (c)
        {
        case 'c':       /* Enter command from keyboard */
            werase(cmdwin);
            wprintw(cmdwin, "Enter command:");
            wmove(cmdwin, 2, 0);
            for (i = 0; i < COLS; i++)
                waddch(cmdwin, '-');
            wmove(cmdwin, 1, 0);
            touchwin(cmdwin);
            wrefresh(cmdwin);
            wgetstr(cmdwin, buf);
            touchwin(stdscr);
            /*
             * The command is now in buf.
             * It should be processed here.
             */
            break;

        case 'q':
            endwin();
            exit(0);
        }
    }
}
This appendix contains a summary of the individual SCCS commands. The
user-level interface to SCCS is described in Chapter 7 of this manual. In the
unlikely event that you need to use the ‘raw’ commands of SCCS, here they are.
Be aware that the commands described here do not make any assumptions about 
where the SCCS-files are — you must spell them out in excruciating detail. The 
individual SCCS tools are not easy to use, but they do provide extremely close 
control over the SCCS database files. Of particular interest are the numbering of
versions and branch versions, the l.file, which gives a description of what deltas
were used on a get, and certain other SCCS commands.

The following topics are covered here: 

□ The scheme used to identify versions of text kept in an SCCS file. 

□ Basic information needed for day-to-day use of SCCS commands, including 
a discussion of the more useful arguments. 

□ Protection and auditing of SCCS files, including the differences between the 
use of SCCS by individual users on one hand, and groups of users on the 
other. 


A.1. Low Level SCCS For Beginners

In this section, we present some basic concepts of SCCS. Examples are fragments
of terminal sessions, with what you type shown in bold typewriter font,
and what the terminal displays shown in typewriter font.

Note that all the SCCS commands described here live in the /usr/sccs directory,
so you must either include the directory pathname explicitly when using
SCCS commands, or include it in your shell’s search path. This chapter assumes
that you have included /usr/sccs in your path.


Terminology 



Each SCCS file is composed of one or more sets of changes applied to the null 
(empty) version of the file; each set of changes usually depends on all previous 
sets. Each set of changes is called a delta and is assigned a name called the SCCS 
Identification string (SID). 

The SID is composed of at most four components; for now let’s focus on only the
first two: the “release” and “level” numbers. Each set of changes to a file is
named ‘release.level’; hence, the first delta is called ‘1.1’, the second ‘1.2’, the
third ‘1.3’, and so on. The release number can also be changed, allowing, for
example, deltas ‘2.1’, ‘3.19’, etc. A change in the release number can be used,
perhaps, to indicate a major update to the file, or to signal the start of a new
round of related updates.

Each delta of an SCCS file defines a particular version of the file. For example,
delta 1.5 defines the version of the SCCS file obtained by applying the changes
that constitute deltas 1.1, 1.2, etc., up to and including delta 1.5 itself, in that
order, to the null (empty) version of the file.

A.2. SCCS File Numbering Conventions

You can think of the deltas applied to an SCCS file as the nodes of a tree; the root
is the initial version of the file. The root delta (node) is normally named ‘1.1’
and successor deltas (nodes) are named ‘1.2’, ‘1.3’, etc. We have already
discussed these two components of the names of the deltas, the ‘release’ and ‘level’
numbers; and you have seen that normal naming of successor deltas proceeds by
incrementing the level number, which is performed automatically by SCCS
whenever a delta is made. In addition, you have seen how to change the release
number when making a delta, to indicate that a major change to the file is being
made. The new release number applies to all successor deltas, unless it is
specifically changed again. Thus, the evolution of a particular file may be
represented as in Figure A-1.



Figure A-1 Evolution of an SCCS File 

We can call this structure the ‘trunk’ of the SCCS tree. It represents the normal 
sequential development of an SCCS file, in which changes that are part of any 
given delta are dependent upon all the preceding deltas. 

Branches

However, there are situations when a branch is needed on the tree: when changes
applied as part of a given delta are not dependent upon all previous deltas. As an
example, consider a program which is in production use at version 1.3, and for
which development work on release 2 is already in progress. Thus, release 2 may
already have some deltas, precisely as shown in Figure A-1. Assume that a
production user reports a problem in version 1.3 which cannot wait until release 2 to be
repaired. The changes necessary to repair the trouble will be applied as a delta to
version 1.3 (the version in production use). This creates a new version that will
then be released to the user, but will not affect the changes being applied for
release 2 (that is, deltas 1.4, 2.1, 2.2, etc.).

The new delta is a node on a ‘branch’ of the tree, and its name consists of four
components: the release and level numbers, as with trunk deltas, plus the
‘branch’ and ‘sequence’ numbers. Its SID thus appears as:

     release.level.branch.sequence

The branch number is assigned to each
branch that is a descendant of a particular trunk delta; the first such branch is 1,
the next one 2, and so on. The sequence number is assigned, in order, to each
delta on a particular branch. Thus, 1.3.1.2 identifies the second delta of the first
branch that derives from delta 1.3 (see Figure A-2).



Figure A-2 Tree Structure with Branch Deltas 

The concept of branching may be extended to any delta in the tree; the naming of 
the resulting deltas proceeds in the manner just illustrated. 

Two observations are of importance with regard to naming deltas. First, the
names of trunk deltas contain exactly two components, and the names of branch
deltas contain exactly four components. Second, the first two components of the
name of a branch delta are always those of the ancestral trunk delta, and the
branch component is assigned in the order of creation of the branch,
independently of its location relative to the trunk delta. Thus, a branch delta may always
be identified as such from its name. Although the ancestral trunk delta may be
identified from the branch delta’s name, it is not possible to determine the entire
path leading from the trunk delta to the branch delta. For example, if delta 1.3
has one branch emanating from it, all deltas on that branch will be named 1.3.1.n.
If a delta on this branch then has another branch emanating from it, all deltas on
the new branch will be named 1.3.2.n (see Figure A-3). The only information that
may be derived from the name of delta 1.3.2.2 is that it is the second
chronological delta on the second chronological branch whose trunk ancestor is
delta 1.3. In particular, it is not possible to determine from the name of delta
1.3.2.2 all of the deltas between it and its trunk ancestor (1.3).
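
The naming rules above (two components for a trunk delta, four for a branch delta) can be checked mechanically. The following is a minimal sketch, not part of SCCS itself: struct sid, parse_sid, and sid_is_branch are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

struct sid {
    int release, level, branch, sequence;
    int ncomp;      /* 2 for a trunk delta, 4 for a branch delta */
};

/* Parse "R.L" or "R.L.B.S"; return 0 on success, -1 on malformed input. */
int parse_sid(const char *text, struct sid *out)
{
    long v[4] = { 0, 0, 0, 0 };
    int n = 0;
    char *end;

    assert(text != NULL && out != NULL);
    for (;;) {
        if (n == 4 || !isdigit((unsigned char)*text))
            return -1;                  /* too many parts, or not a number */
        v[n++] = strtol(text, &end, 10);
        if (*end == '\0')
            break;
        if (*end != '.')
            return -1;
        text = end + 1;
    }
    if (n != 2 && n != 4)
        return -1;                      /* delta names have 2 or 4 parts */
    out->release = (int)v[0];
    out->level = (int)v[1];
    out->branch = (int)v[2];
    out->sequence = (int)v[3];
    out->ncomp = n;
    return 0;
}

/* A branch delta is identified purely by having four components. */
int sid_is_branch(const struct sid *s)
{
    return s->ncomp == 4;
}
```

For 1.3.2.2 the sketch reports a branch delta whose trunk ancestor is 1.3; as the text notes, nothing in the name reveals which intermediate deltas lie between the two.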





Figure A-3 Extending the Branching Concept 

It is obvious that the concept of branch deltas allows the generation of arbitrarily 
complex tree structures. Although this capability has been provided for certain 
specialized uses, it is strongly recommended that the SCCS tree be kept as simple 
as possible, because comprehension of its structure becomes extremely difficult 
as the tree becomes more complex. 


A.3. Summary of SCCS 
Commands 


Here is a summary of all the SCCS commands and their major functions:

admin     Creates SCCS files and applies changes to parameters of SCCS files.
          admin is described in section A.5.

cdc       Changes the commentary associated with a delta. cdc is described
          in section A.6.

comb      Combines two or more consecutive deltas of an SCCS file into a
          single delta. comb is described in section A.7.

delta     Applies changes (deltas) to the text of SCCS files; that is, delta
          creates new versions. delta is described in section A.8.

get       Retrieves versions of SCCS files. get is described in section A.9.

help      Explains SCCS commands and diagnostic messages. help is
          described in section A.10.

prs       Prints portions of an SCCS file in user-specified format. prs is
          described in section A.11.

rmdel     Removes a delta from an SCCS file; useful for removing deltas that
          were created by mistake. rmdel is described in section A.12.

sccsdiff  Shows the differences between any two versions of an SCCS file.
          sccsdiff is described in section A.14.

val       Validates an SCCS file. val is described in section A.16.

what      Searches file(s) for all occurrences of a special pattern and prints
          what follows it. what is useful in finding identifying information
          inserted by get. what is described later in this appendix.


A.4. SCCS Command Conventions

This section discusses the conventions and rules that apply to SCCS commands.
These rules and conventions are generally applicable to all SCCS commands,
except as indicated below.

SCCS commands, like most SunOS commands, accept options and filename
arguments.

Options

Options begin with a minus sign (-), followed by a lower-case alphabetic
character, and, in some cases, a value. Options modify actions of commands on which
they are specified.

Filename Arguments

Filename arguments (which may be names of files and/or directories) specify the
file(s) that the given SCCS command is to process; naming a directory is
equivalent to naming all the SCCS files within the directory. Non-SCCS files and
unreadable files in the named directories are silently ignored.

In general, file arguments may not begin with a minus sign. However, if the
name ‘-’ (a lone minus sign) is specified as an argument to a command, the
command reads the standard input for lines and takes each line as the name of an
SCCS file to be processed. The standard input is read until end-of-file. This
feature is often used in pipelines with, for example, the find(1) or ls(1)
commands. Again, names of non-SCCS files and of unreadable files are silently
ignored.

Options specified for a given command apply to all filename arguments. Options
are processed before any file arguments; therefore the placement of options is
arbitrary, that is, options may be interspersed with file arguments. File
arguments, however, are processed left to right.

Somewhat different argument conventions apply to the help, what,
sccsdiff, and val commands.

Flags

Certain actions of various SCCS commands are modified by flags embedded in
the text of SCCS files. Some of these flags are discussed below. For a complete
description of all such flags, see admin.

Real/Effective User

The distinction between the real user (see passwd(1)) and the effective user ID
is of concern in discussing various actions of SCCS commands. For the present,
it is assumed that both the real user and the effective user are one and the same,
that is, the user who is logged into the system.

Back-up Files Created During Processing

All SCCS commands that modify an SCCS file do so by writing a temporary copy,
called the x.file, to ensure that the SCCS file will not be damaged if processing
terminates abnormally. The name of the x.file is formed by replacing the ‘s.’
of the SCCS-file name with ‘x.’. When processing is complete, the old SCCS file
is removed and the x.file is renamed to take its place. The x.file is created in
the directory containing the SCCS file, given the same permission mode (see
chmod(1)), and is owned by the effective user.

To prevent simultaneous updates to an SCCS file, commands that modify SCCS
files create a lock file, called the z.file, whose name is formed by replacing the ‘s.’
with ‘z.’. The z.file contains the process ID number of the command that
creates it, and its existence is an indication to other commands that that SCCS file
is being updated. Thus, other commands that modify SCCS files will not process
an SCCS file if the corresponding z.file exists. The z.file is created with mode
444 (read-only) in the directory containing the SCCS file, and is owned by the
effective user. The z.file exists only for the duration of the execution of the
command that creates it. In general, users can ignore x.files and z.files; they
may be useful in the event of system crashes or similar situations.

Diagnostics

SCCS commands direct their diagnostic responses to the standard error file.
SCCS diagnostics generally look like this:

     ERROR [filename]: message text (code)

The code in parentheses may be used as an argument to help to obtain
a fuller explanation of the diagnostic message.

If the SCCS command detects a fatal error during the processing of a file it
terminates processing of that file and proceeds with the next file in the series, if
more than one file has been named.


A.5. admin — Create and Administer SCCS Files

admin creates new SCCS files and changes parameters of existing ones. Options
and SCCS file names may appear in any order on the admin command line.

SCCS file names must begin with the characters ‘s.’. A named file is created if
it doesn’t exist already, and its parameters are initialized according to the
specified options. Any parameter not initialized by an option is assigned a
default value. If a named file does exist, parameters corresponding to specified
options are changed, and other parameters are left as is.

     admin [ -n ] [ -i[name] ] [ -rrel ] [ -t[name] ] [ -fflag[flag-val] ] ...
           [ -dflag[flag-val] ] ... [ -alogin ] ... [ -elogin ] ... [ -m[mrlist] ]
           [ -y[comment] ] [ -h ] [ -z ] filename ...

If a directory is named, admin behaves as though each file in the directory were
specified as a named file, except that non-SCCS files (last component of the path
name does not begin with s.) and unreadable files are silently ignored. A name
of - means the standard input: each line of the standard input is taken as the
name of an SCCS file to be processed. Again, non-SCCS files and unreadable
files are silently ignored.

admin Options

Options are explained as though only one named file is to be processed, since
options apply independently to each named file.

Creating a New SCCS File
-n   A new SCCS file is being created.

Initial Text
-i[name]
     Initial text: file name contains the text of a new SCCS file. The text is the
     first delta of the file; see the -r option for the delta numbering scheme. If name
     is omitted, the text is obtained from the standard input. Omitting the -i
     option altogether creates an empty SCCS file. You can only create one SCCS
     file with an admin -i command. Creating more than one SCCS file
     with a single admin command requires that they be created empty, in which
     case the -i option should be omitted. Note that the -i option implies the
     -n option.

Initial Release
-rrel
     Initial release: the release into which the initial delta is inserted. -r may
     be used only if the -i option is also used. The initial delta is inserted into
     release 1 if the -r option is not used. The level of the initial delta is always
     1, and initial deltas are named 1.1 by default.

Descriptive Text
-t[name]
     Descriptive text: The file name contains descriptive text for the SCCS file.
     The descriptive text file name must be supplied when creating a new SCCS
     file (either or both -n and -i options) and the -t option is used. In the
     case of existing SCCS files: 1) a -t option without a file name removes
     descriptive text (if any) currently in the SCCS file, and 2) a -t option with a
     file name replaces the descriptive text currently in the SCCS file with any
     text in the named file.

Set a Flag
-f flag
     Set flag: specifies a flag, and, possibly, a value for the flag, to be placed in
     the SCCS file. Several -f options may be supplied on a single admin
     command line. Flags and their values appear in the FLAGS section after
     this list of options.

Delete a Flag
-dflag
     Delete flag from an SCCS file. The -d option may be specified only when
     processing existing SCCS files. Several -d options may be supplied on a
     single admin command. See the FLAGS section below.

Unlock Releases
-llist
     Unlock the specified list of releases. See the -f option for a description of
     the l flag and the syntax of a list.

Add Login Name
-a login
     Add login name, or numerical group ID, to the list of users who may make
     deltas (changes) to the SCCS file. A group ID is equivalent to specifying all
     login names common to that group ID. Several -a options may appear on a
     single admin command line. As many logins, or numerical group IDs, as
     desired may be on the list simultaneously. If the list of users is empty,
     anyone may add deltas.

Erase Login Name
-e login
     Erase login name, or numerical group ID, from the list of users allowed to
     make deltas (changes) to the SCCS file. Specifying a group ID is equivalent
     to specifying all login names common to that group ID. Several -e options
     may be used on a single admin command line.

Insert Comment Text
-y[comment]
     The comment text is inserted into the SCCS file as a comment for the initial
     delta in a manner identical to that of delta. If the -y option is omitted, a
     default comment line is inserted in the form:

          date and time created yy/mm/dd hh:mm:ss by login

     The -y option is valid only if the -i and/or -n options are specified (that
     is, a new SCCS file is being created).

Modification List
-m[mrlist]
     The list of Modification Request (MR) numbers is inserted into the SCCS
     file as the reason for creating the initial delta in a manner identical to
     delta. The v flag must be set and the MR numbers are validated if the v
     flag has a value (the name of an MR number validation program).
     Diagnostics are displayed if the v flag is not set or MR validation fails.

Check Structures of SCCS File
-h   Check the structure of the SCCS file (see sccsfile(5)), and compare a newly
     computed check-sum (the sum of all the characters in the SCCS file except
     those in the first line) with the check-sum that is stored in the first line of the
     SCCS file.

     The -h option inhibits writing on the file, so that it nullifies the effect of any
     other options supplied, and is, therefore, only meaningful when processing
     existing files.

Recompute Checksum
-z   Recompute the SCCS file check-sum and store it in the first line of the SCCS
     file (see -h, above).

     Using the -z option on a truly corrupted file may prevent future detection of
     the corruption.
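
The check-sum that -h verifies and -z recomputes is described above as the sum of all characters in the file except those in the first line. The following is a minimal sketch of that computation over an in-memory buffer. The function name sccs_checksum is hypothetical, and two details are assumptions not stated in the text: characters are summed as unsigned values, and the sum wraps at 16 bits.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sum every character that follows the first line of `data`.
 * Assumptions (not taken from the manual): bytes are treated as
 * unsigned and the running sum is kept to 16 bits.
 */
unsigned sccs_checksum(const char *data, size_t len)
{
    size_t i = 0;
    unsigned sum = 0;

    assert(data != NULL);
    while (i < len && data[i] != '\n')  /* skip the first line */
        i++;
    if (i < len)
        i++;                            /* skip its newline as well */
    for (; i < len; i++)
        sum = (sum + (unsigned char)data[i]) & 0xFFFF;
    return sum;
}
```

Because the first line is excluded, the stored check-sum can be rewritten without changing the value it is compared against, which is why -z on a damaged file silently makes the damage look valid.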
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Flags In SCCS Files 


Branch Deltas can be Created 


The list below is a description of iht flags which may appear as arguments to the 
-f (set flags) and -d (delete flags) options. 

b When set, the -b option can be used on a get command to create branch 
deltas. 


Highest Retrievable Release 


Lowest Retrievable Release 


Default Delta Number 


No ID Keywords Fatal Error 


Encoded Binary File 


Allow Concurrent Edits 


c ceil 

The highest release (ceiling) which may be retrieved by a get command for 
editing. The ceiling is a number less than or equal to 9999. The default 
value for an imspecified c flag is 9999. 

t floor 

The lowest release (floor) which may be retrieved by a get command for 
editing. The floor is a number greater than 0 but less than 9999. The default 
value for an unspecified f flag is 1. 

dSID 

The default delta number (ID) to be used by a get command. 

i Treats the ‘No id keywords (ge6)’ message issued by get or delta as a 
fatal error. In the absence of the i flag, the message is only a warning. The 
message is displayed if no SCCS identification keywords (see get) are 
found in the text retrieved or stored in the SCCS file. 

e 1 

If the e flag appears with a 1 argument, the file is an encoded (see 
uuencode(lC) representation of a binary data file. 

j Concurrent get commands for editing may apply to the same SID of an 
SCCS file. This allows multiple concurrent updates to the same version of 
the SCCS file. 


Locked Releases 1 list 

A list of locked releases to which deltas can no longer be made. A 
get -e fails when applied against one ofthese locked releases. The list 
has the following syntax: 

< list > ::= < range > | < list > , < range > 

< range > RELEASE NUMBER | a 


The character a in the list is equivalent to specifying all releases for the 
named sees file. 


Create Null Deltas n

The delta command creates a ‘null’ delta in each release (if any) being
skipped when a delta is made in a new release. For example, releases 3 and
4 are skipped when making delta 5.1 after delta 2.7. These null deltas serve
as ‘anchor points’ so that branch deltas may be created from them later. If
the n flag is absent from the SCCS file, skipped releases will be non-existent
in the SCCS file, preventing branch deltas from being created from them in
the future.

q text

text is defined by the user. The text is substituted for all occurrences of the
%Q% keyword in SCCS file text retrieved by get.

Module Name m module

Module name of the SCCS file substituted for all occurrences of the %M%
keyword in SCCS file text retrieved by get. If the m flag is not specified, the
value assigned is the name of the SCCS file with the leading s. removed.

Module Type t type

Type of module in the SCCS file substituted for all occurrences of the %Y%
keyword in SCCS file text retrieved by get.

Validity Checking Program v [program ] 

Validity checking program: delta prompts for Modification Request (MR) 
numbers as the reason for creating a delta. The optional program specifies 
the name of an MR number validity checking program (see delta). If this 
flag is set when creating an SCCS file, the -m option must also be used even 
if its value is null. 
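
The way the c (ceiling), f (floor), and l (locked releases) flags described above restrict a get for editing can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the order in which the real commands apply these checks is an assumption.

```python
def parse_locked(spec):
    """Parse a locked-release list such as "2,5" or "a" (all releases)."""
    items = [item.strip() for item in spec.split(",") if item.strip()]
    if "a" in items:
        return "all"
    return {int(item) for item in items}


def may_edit(release, floor=1, ceiling=9999, locked=""):
    """Return True if a get for editing of `release` would be permitted.

    The defaults mirror the documented defaults of the f and c flags.
    """
    if not floor <= release <= ceiling:
        return False
    locked_set = parse_locked(locked)
    return locked_set != "all" and release not in locked_set
```

For example, may_edit(3, locked="2,3") is false because release 3 is locked, while may_edit(4, locked="2,3") is true.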

Files Used

The last component of all SCCS file names must be of the form s.filename.

New SCCS files are given mode 444 (see chmod). Write permission in the
pertinent directory is, of course, required to create a file. All writing done by
admin is to a temporary x.file, called x.filename (see get(1)), created with
mode 444 if the admin command is creating a new SCCS file, or with the same
mode as the SCCS file if it exists. After successful execution of admin, the SCCS
file is removed (if it exists), and the x.file is renamed with the name of the SCCS
file. This ensures that changes are made to the SCCS file only if no errors
occurred.
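
The write-then-rename technique described above can be sketched as follows. This is not the admin implementation; update_sccs_file is a hypothetical name, and os.replace is used here to combine the "remove, then rename" steps into one operation.

```python
import os
import stat


def update_sccs_file(sccs_path, new_text):
    """Write new_text to the x.file, then rename it over the SCCS file."""
    directory, name = os.path.split(sccs_path)
    if not name.startswith("s."):
        raise ValueError("SCCS file names must begin with 's.'")
    x_path = os.path.join(directory, "x." + name[2:])

    # Keep the mode of an existing SCCS file; a new one gets mode 444.
    if os.path.exists(sccs_path):
        mode = stat.S_IMODE(os.stat(sccs_path).st_mode)
    else:
        mode = 0o444

    try:
        with open(x_path, "w") as x_file:
            x_file.write(new_text)
        os.chmod(x_path, mode)
        # The SCCS file changes only if everything above succeeded.
        os.replace(x_path, sccs_path)
    except BaseException:
        if os.path.exists(x_path):
            os.remove(x_path)
        raise
```

Because the SCCS file is replaced only after the x.file has been written completely, a failure part-way through leaves the original file untouched.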

It is recommended that directories containing SCCS files be mode 755 and that 
SCCS files themselves be mode 444. The mode of the directories allows only the 
owner to modify SCCS files contained in the directories. The mode of the SCCS 
files prevents any modification at all except by SCCS commands. 

If it should be necessary to patch an SCCS file for any reason, the mode may be
changed to 644 by the owner, allowing use of a text editor. Care must be taken!
The edited file should always be processed by an admin -h to check for
corruption, followed by an admin -z to generate a proper check-sum. Another
admin -h is recommended to ensure the SCCS file is valid.

admin also uses a transient lock file (called z.filename) to prevent simultaneous
updates to the SCCS file by different users. See get for further information.



Revision A of 9 May 1988 



Appendix A — SCCS Low-Level Commands 363 


Examples of Using admin

Suppose you have a file called lang that contains a list of programming
languages:

hermes% cat lang
FORTRAN
COBOL
Algol
hermes%



We wish to give SCCS custody of ‘lang’ by using admin (which administers 
SCCS files) to create an SCCS file and initialize delta 1.1. To do so, we use 
admin as shown, and admin responds with a message: 


hermes% admin -ilang s.lang
No id keywords (cm7)
hermes%



All SCCS files must have names that begin with ‘s.’, hence, ‘s.lang’. The -i 
option, together with its value ‘lang’, indicates that admin is to create a new 
SCCS file and initialize it with the contents of the file ‘lang’. This initial version 
is a set of changes applied to the null SCCS file; it is delta 1.1. 

The message is a warning message (which may also be issued by other SCCS 
commands) that you can ignore for the present. 

Remove the file ‘lang’ now; it can easily be reconstructed with the get
command, described later in this appendix.


Inserting Commentary for the Initial Delta


You can use the -y and -m options with admin, just as with delta, to insert 
initial descriptive commentary and/or MR numbers when an SCCS file is created. 
If you don’t use -y to comment, admin automatically inserts a comment line 
of the form: 

date and time created yy/mm/dd hh:MM:ss by logname 
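
As an illustration of the default comment line shown above, the following Python sketch builds a string of that form. The function name and the sample login name are hypothetical, and zero-padded two-digit fields are assumed.

```python
from datetime import datetime


def default_comment(created, logname):
    """Build a comment line of the form admin inserts when -y is omitted."""
    stamp = created.strftime("%y/%m/%d %H:%M:%S")
    return f"date and time created {stamp} by {logname}"
```

For instance, default_comment(datetime(1988, 5, 9, 14, 3, 7), "wendy") yields "date and time created 88/05/09 14:03:07 by wendy".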


If you want to supply MR numbers (-m option), the v flag must also be set 
(using the -f option described below). The v flag simply determines whether 
or not MR numbers must be supplied when using any SCCS command that 
modifies a delta commentary in the SCCS file (see sccsfile(5)). Thus: 


hermes% admin -ifirst -mmrnum1 -fv s.abc


Note that the -y and -m options are only effective if a new SCCS file is being 
created. 


Initializing and Modifying SCCS File Parameters

The portion of the SCCS file reserved for descriptive text may be initialized or
changed through the use of the -t option. The descriptive text is intended as a
summary of the contents and purpose of the SCCS file; in practice, its contents and
length are up to you.

When an SCCS file is being created and the -t option is supplied, it must be
followed by the name of a file from which the descriptive text is to be taken. For
example, the command

hermes% admin -ifirst -tdesc s.abc

specifies that the descriptive text is to be taken from file ‘desc’.


When processing an existing SCCS file, the -t option specifies that the
descriptive text (if any) currently in the file is to be replaced with the text in the named
file. Thus:

hermes% admin -tdesc s.abc

specifies that the descriptive text of the SCCS file is to be replaced by the
contents of ‘desc’. Omitting the filename after the -t option removes the
descriptive text from the SCCS file:

hermes% admin -t s.abc

The flags of an SCCS file (see the section entitled Descriptive Text) may be
initialized and changed with the -f (flag) option, or may be deleted with the -d
(delete) option. The flags of an SCCS file direct certain actions of the various
commands. See admin for a description of all the flags. For example, the i flag
specifies that the warning message stating there are no ID keywords contained in
the SCCS file should be treated as an error, and the d (default SID) flag specifies
the default version of the SCCS file to be retrieved by the get command. The
-f option sets a flag and, possibly, sets its value. For example:

hermes% admin -ifirst -fi -fmmodname s.abc

sets the i flag and the m (module name) flag. The value ‘modname’ specified
for the m flag is the value that the get command uses to replace the %M% ID
keyword. In the absence of the m flag, the name of the g-file is used as the
replacement for the %M% ID keyword. Note that several -f options may be supplied on
a single admin command, and that -f options may be supplied whether the
command is creating a new SCCS file or processing an existing one.



The -d option deletes a flag from an SCCS file, and may only be specified when
processing an existing file. As an example, the command:

hermes% admin -dm s.abc

removes the m flag from the SCCS file. Several -d options may be supplied on a
single admin command, and may be interspersed with -f options.


SCCS files contain a list (user list) of login names and/or group IDs of users who
are allowed to create deltas. This list is normally empty, implying that anyone
may create deltas. To add login names and/or group IDs to the list, use the
admin command with the -a option. For example:

hermes% admin -awendy -aalison -a1234 s.abc

adds the login names ‘wendy’ and ‘alison’ and the group ID ‘1234’ to the list.
The -a option may be used whether admin is creating a new SCCS file or
processing an existing one, and may appear several times. The -e option is used in
an analogous manner if one wishes to remove (‘erase’) login names or group IDs
from the list.
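
The effect of the user list can be sketched as a simple membership test. The function below is hypothetical and models only the rule stated above (an empty list admits everyone); it says nothing about how SCCS stores the list.

```python
def may_create_delta(login, group_ids, user_list):
    """Return True if the user may create deltas under the given user list."""
    if not user_list:
        # An empty user list means anyone may create deltas.
        return True
    allowed = {str(entry) for entry in user_list}
    return login in allowed or any(str(gid) in allowed for gid in group_ids)
```

With the list built by the example above, may_create_delta("wendy", [10], ["wendy", "alison", "1234"]) is true, and so is a call for any user whose group IDs include 1234.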

A.6. cdc — Change Delta Commentary

cdc changes the delta commentary, for the SID specified by the -r option, of
each named SCCS file.

cdc -rSID [ -m[mrlist] ] [ -y[comment] ] filename ...


Delta commentary is defined to be the Modification Request (MR) and comment 
information normally specified via the delta command (-m and -y options). 

If a directory is named, cdc behaves as though each file in the directory were
specified as a named file, except that non-SCCS files (last component of the path
name does not begin with s.) and unreadable files are silently ignored. If a
name of - is given, the standard input is read (see the NOTE below); each line of
the standard input is taken to be the name of an SCCS file to be processed.

Arguments to cdc, which may appear in any order, consist of options and file
names.

cdc Options

All the described options apply independently to each named file:

SID Identification String -rSID

Specifies the SCCS IDentification string (SID) of a delta for which the delta
commentary is to be changed.

MR List -m[mrlist]

If the SCCS file has the v flag set (see admin), a list of MR numbers to be
added and/or deleted in the delta commentary of the SID specified by the -r
option may be supplied. A null MR list has no effect.

MR entries are added to the list of MRs in the same manner as that of
delta. To delete an MR, precede the MR number with the character ! (see
Examples of Using cdc). If the MR to be deleted is currently in the list of MRs, it is
removed and changed into a “comment” line. A list of all deleted MRs is
placed in the comment section of the delta commentary and preceded by a
comment line stating that they were deleted.

If -m is not used and the standard input is a terminal, the prompt MRs? is
issued on the standard output before the standard input is read; if the
standard input is not a terminal, no prompt is issued. The MRs? prompt always
precedes the comments? prompt (see -y option).

MRs in a list are separated by blanks and/or tab characters. An unescaped
new-line character terminates the MR list.

Note that if the v flag has a value (see admin), it is taken to be the name of
a program (or shell procedure) which validates the correctness of the MR
numbers. If a non-zero exit status is returned from the MR number
validation program, cdc terminates and the delta commentary remains unchanged.

Comment Text -y[comment]

Arbitrary text used to replace the comment(s) already existing for the delta
specified by the -r option. The previous comments are kept and preceded
by a comment line stating that they were changed. A null comment has no
effect.

If -y is not specified and the standard input is a terminal, the prompt
comments? is issued on the standard output before the standard input is read; if
the standard input is not a terminal, no prompt is issued. An unescaped
new-line character terminates the comment text.


Examples of Using cdc

hermes% cdc -r1.6 -m"bl78-12345 !bl77-54321 bl79-00001" -ytrouble s.file

adds bl78-12345 and bl79-00001 to the MR list, removes bl77-54321 from the
MR list, and adds the comment trouble to delta 1.6 of s.file.

hermes% cdc -r1.6 s.file
MRs? !bl77-54321 bl78-12345 bl79-00001
comments? trouble

does the same thing.
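
How an MR list such as the one in these examples divides into additions and deletions can be sketched in Python. The function name is hypothetical; only the rules stated above are modeled (entries separated by blanks or tabs, a leading ! marking a deletion).

```python
def split_mr_list(mr_list):
    """Split an MR list into (additions, deletions)."""
    additions, deletions = [], []
    # str.split() with no argument splits on runs of blanks and tabs.
    for entry in mr_list.split():
        if entry.startswith("!"):
            deletions.append(entry[1:])
        else:
            additions.append(entry)
    return additions, deletions
```

Applied to the list used above, it returns bl78-12345 and bl79-00001 as additions and bl77-54321 as the single deletion.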

NOTE If SCCS file names are supplied to the cdc command via the standard input (- on
the command line), then the -m and -y options must also be used.


Files Used


x.file (see delta)

z.file (see delta)


A.7. comb — Combine SCCS Deltas

comb generates a Bourne Shell procedure which, when run, will reconstruct the
given SCCS files.

comb [ -o ] [ -s ] [ -p SID ] [ -c list ] filename ...

If a directory is named, comb behaves as though each file in the directory were
specified as a named file, except that non-SCCS files (last component of the path
name does not begin with s.) and unreadable files are silently ignored. If a
name of - is given, the standard input is read; each line of the standard input is
taken to be the name of an SCCS file to be processed; non-SCCS files and
unreadable files are silently ignored. The generated shell procedure is written on the
standard output.

comb Options

Options are explained as though only one named file is to be processed, but the
effects of any option apply independently to each named file.

ID String -p SID

The SCCS IDentification string (SID) of the oldest delta to be preserved.
All older deltas are discarded in the reconstructed file.

Preserve List -c list

A list of deltas to be preserved. All other deltas are discarded. See get for
the syntax of a list.

Access at Release -o

For each get -e generated, the reconstructed file is accessed at the release
of the delta to be created. In the absence of the -o option, the reconstructed
file is accessed at the most recent ancestor. Use of the -o option may
decrease the size of the reconstructed SCCS file. It may also alter the shape
of the delta tree of the original file.

Generate Report -s

Generate a shell procedure which, when run, will produce a report giving,
for each file: the file name, size (in blocks) after combining, original size
(also in blocks), and percentage change computed by:

100 * (original - combined) / original

It is recommended that before any SCCS files are actually combined, you
use this option to determine exactly how much space is saved by the
combining process.
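
The percentage formula above is easy to check by hand. The following Python sketch evaluates it; the function name is hypothetical, and whether comb itself rounds or truncates the result is not stated here.

```python
def percent_change(original_blocks, combined_blocks):
    """Percentage change as given by 100 * (original - combined) / original."""
    return 100 * (original_blocks - combined_blocks) / original_blocks
```

A file that shrinks from 40 blocks to 30 gives 25.0; one that grows from 40 to 50 gives -25.0, which is how a reconstructed file larger than the original would show up in the report.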

If no options are specified, comb preserves only leaf deltas and the minimal 
number of ancestors needed to preserve the tree. 

Files Used

s.COMB
The name of the reconstructed SCCS file.

comb?????
Temporary.

Limitations of the comb Command

comb may rearrange the shape of the tree of deltas. It may not save any space;
in fact, it is possible for the reconstructed file to actually be larger than the
original.

A.8. delta — Make a Delta

delta permanently introduces into the named SCCS file changes that were made
to the file retrieved by get (called the g-file or generated file).

delta [ -r SID ] [ -s ] [ -n ] [ -g list ] [ -m [ mrlist ] ] [ -y [ comment ] ] [ -p ] filename ...

delta makes a delta to each named SCCS file. If a directory is named, delta
behaves as though each file in the directory were specified as a named file, except
that non-SCCS files (last component of the path name does not begin with s.)
and unreadable files are silently ignored. If a name of - is given, the standard
input is read (see the NOTEs below); each line of the standard input is taken to be the
name of an SCCS file to be processed.

delta may issue prompts on the standard output depending upon certain
options specified and flags (see admin) that may be present in the SCCS file (see
-m and -y options below).

delta Options

Options apply independently to each named file.

Delta Number -r SID

Uniquely identifies which delta is to be made to the SCCS file. The use of
this option is necessary only if two or more outstanding gets for editing (get
-e) on the same SCCS file were done by the same person (login name). The
SID value specified with the -r option can be either the SID specified on the
get command line or the SID to be made as reported by the get command
(see get). A diagnostic results if the specified SID is ambiguous, or if it is
necessary and omitted on the command line.

No Report -s

Do not display the created delta’s ID, number of lines inserted, deleted and
unchanged in the SCCS file.

Retain g-file -n

Retain the edited g-file which is normally removed at completion of delta
processing.

Ignore List -g list

Specifies a list of deltas to be ignored when the file is accessed at the change
level (ID) created by this delta. See get for the definition of list.

MR Number -m [ mrlist ]

If the SCCS file has the v flag set (see admin), a Modification Request (MR)
number must be supplied as the reason for creating the new delta.

If -m is not used and the standard input is a terminal, the prompt MRs? is
issued on the standard output before the standard input is read; if the
standard input is not a terminal, no prompt is issued. The MRs? prompt always
precedes the comments? prompt (see -y option).

MRs in a list are separated by blanks and/or tab characters. An unescaped
new-line character terminates the MR list.

Note that if the v flag has a value (see admin), it is taken to be the name of
a program (or shell procedure) which will validate the correctness of the MR
numbers. If a non-zero exit status is returned from the MR number validation
program, delta terminates (it is assumed that the MR numbers were not all
valid).

Comment Text -y [ comment ]

Arbitrary text to describe the reason for making the delta. A null string is
considered a valid comment.

If -y is not specified and the standard input is a terminal, the prompt
comments? is issued on the standard output before the standard input is read;
if the standard input is not a terminal, no prompt is issued. An unescaped
new-line character terminates the comment text.

Display Differences -p

Display (on the standard output) the SCCS file differences before and after
the delta is applied in a diff format.

Files Used

g-file      Existed before the execution of delta; removed after completion
            of delta.

p.file      Existed before the execution of delta; may exist after completion
            of delta.

q.file      Created during the execution of delta; removed after completion
            of delta.

x.file      Created during the execution of delta; renamed to SCCS file after
            completion of delta.

z.file      Created during the execution of delta; removed during the
            execution of delta.

d.file      Created during the execution of delta; removed after completion
            of delta.

/bin/diff   Program to compute differences between the “gotten” file and the
            g-file.

NOTE Lines beginning with an ASCII SOH character (binary 001) cannot be placed in
the SCCS file unless the SOH is escaped. This character has special meaning to
SCCS (see sccsfile(5)) and will cause an error.

NOTE A get of many SCCS files, followed by a delta of those files, should be avoided
when the get generates a large amount of data. Instead, multiple get/delta
sequences should be used.

NOTE If the standard input (-) is specified on the delta command line, the -m (if
necessary) and -y options must also be present. Omission of these options is an
error.

Examples of Using delta

To record the changes that were applied to ‘lang’ within the SCCS file, use the
delta command. delta asks for comments describing the change, and you
respond with a description of why the changes were made:


hermes% delta s.lang
comments? added SNOBOL and Ratfor
More messages from delta (see below)
hermes%

delta then reads the p.file and determines what changes were made to the file
lang. delta does this by doing its own get to retrieve the original version,
and then applying diff(1) to the original version and the edited version. When
the changes to ‘lang’ have been stored in ‘s.lang’, the dialogue with delta
looks like:


hermes% delta s.lang
comments? added SNOBOL and Ratfor
1.2
2 inserted
0 deleted
5 unchanged
hermes%



The number ‘1.2’ is the name of the delta just created, and the next three lines are 
a summary of the changes made to ‘s.lang’. 


More Notes on delta

delta does a series of checks before creating the delta:

1. Searches the p.file for an entry containing the user’s login name, because
the user who retrieved the g-file must be the one who creates the delta.
delta displays an error message if the entry is not found. Note that if the
login name of the user appears in more than one entry (that is, the same user
did a get -e more than once on the same SCCS file), the -r option must
be used with delta to specify an SID that uniquely identifies the p.file
entry. (The SID specified may be either the SID retrieved by get, or the
SID delta is to create.)

2. Performs the same permission checks as get -e.

If these checks succeed, delta compares the g-file (via diff(1)) with its own,
temporary copy of the g-file as it was before editing, to determine what has been
changed. This temporary copy of the g-file is called the d.file (its name is
formed by replacing the s. of the SCCS file name with d.); delta retrieves it
by doing its own get at the SID specified in the p.file entry. If you would like
to see the results of delta’s diff, use the -p option to display it on standard
output.

In practice, the most common use of delta is:

hermes% delta s.abc

If your standard output is a terminal, delta replies: ‘comments?’. You may
now type a response, usually a description of why the delta is being made,
of up to 512 characters, terminating with a newline character. Newline
characters not intended to terminate the response should be preceded by ‘\’.

If the SCCS file has a v flag, delta asks for ‘MRs?’ before prompting for
‘comments?’ (again, this prompt is printed only if the standard output is a
terminal). Enter MR numbers, separated by blanks and/or tabs, and terminate your
response with a newline character. (In a tightly controlled environment, one
would expect deltas to be created only as a result of some trouble report,
change request, trouble ticket, etc., collectively called here Modification
Requests, or MRs, and would think it desirable or necessary to record such MR
number(s) within each delta.)

If you want to enter commentary (comments and/or MR numbers) directly on the
command line, use the -y and/or -m options, respectively. For example:

hermes% delta -y"descriptive comment" -m"mrnum1 mrnum2" s.abc

inserts the ‘descriptive comment’ and the MR numbers ‘mrnum1’ and ‘mrnum2’
without prompting or reading from standard input. -m can only be used if the
SCCS file has a v flag. These options are useful when delta is executed from
within a Shell procedure.

The commentary (comments and/or MR numbers), whether solicited by delta 
or supplied via options, is recorded as part of the entry for the delta being 
created, and applies to all SCCS files processed by the same invocation of 
delta. Thus if delta is used with more than one file argument, and the first 
file named has a v flag, all files named must have this flag. Similarly, if the first 
file named does not have this flag, then none of the files named may have it. 

Only files conforming to these rules are processed.

After the prompts for commentary, and before any other output, delta displays:

No id keywords (cm7)

if it finds no ID keywords in the edited g-file while making a delta. If there were
any ID keywords in the SCCS file, this might mean one of two things. The
keywords may have been replaced by their values (if a get without the -e option
was used to retrieve the g-file). Or, the keywords may have been accidentally
deleted or changed while editing the g-file. Of course, the file may never have
had any ID keywords. In any case, it is left up to you to decide whether any
action is necessary, but the delta is made regardless (unless there is an i flag in
the SCCS file, which makes this a fatal error and kills the delta).
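
The decision described above (warning by default, fatal error when the i flag is set) can be sketched in Python. This is an approximation only: the function is hypothetical, and the pattern treats any single upper-case letter between percent signs as an ID keyword, whereas SCCS recognizes a specific set such as %M%, %E%, and %U%.

```python
import re

# Assumption: any %X% with one upper-case letter counts as an ID keyword.
ID_KEYWORD = re.compile(r"%[A-Z]%")


def check_id_keywords(text, i_flag=False):
    """Classify the outcome of the 'No id keywords' check."""
    if ID_KEYWORD.search(text):
        return "ok"
    # Without the i flag the message is only a warning.
    return "fatal" if i_flag else "warning"
```

A g-file retrieved without -e has had its keywords replaced by values, so it would fall into the "warning" (or "fatal") case just like a file that never contained keywords.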

When processing is complete, delta displays a message containing the SID of
the created delta (obtained from the p.file entry), and the counts of lines
inserted, deleted, and left unchanged. Thus, a typical message might be:

1.4 

14 inserted 
7 deleted 
345 unchanged 

The reported counts may not agree with your sense of changes made; there are a
number of ways to describe a set of such changes, especially if lines are moved
around in the g-file, and delta may describe the set differently than you would.
However, the total number of lines of the new delta (the number inserted plus the
number left unchanged) should agree with the number of lines in the edited
g-file.
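
The relationship between the three counts can be illustrated with Python's difflib, which is used here as a stand-in for diff(1); it may group changes differently from delta, but the invariant (inserted plus unchanged equals the length of the edited file) holds either way. The function name is hypothetical.

```python
from difflib import SequenceMatcher


def delta_counts(old_lines, new_lines):
    """Return (inserted, deleted, unchanged) line counts between two versions."""
    inserted = deleted = unchanged = 0
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            unchanged += i2 - i1
        else:
            # 'insert', 'delete' and 'replace' all reduce to these two sums.
            deleted += i2 - i1
            inserted += j2 - j1
    return inserted, deleted, unchanged
```

Appending two lines to a three-line file gives (2, 0, 3). Moving a line instead is reported as one deletion plus one insertion, which shows why the counts may not match your own sense of the change.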

After processing of an SCCS file is complete, the corresponding p.file entry is
removed from the p.file. (All updates to the p.file are made to a temporary copy,
the q.file, whose use is similar to the use of the x.file described above.) If there
is only one entry in the p.file, the p.file itself is removed.

In addition, delta removes the edited g-file, unless the -n option is specified.
Thus:

hermes% delta -n s.abc

keeps the g-file upon completion of processing.

The -s (silent) option suppresses all output that is normally directed to the
standard output, except the initial prompts for commentary. If you use -s with -y
(and, possibly, -m), delta neither reads standard input nor writes to standard
output.

A.9. get — Get Version of SCCS File

get generates an ASCII text file from each named SCCS file according to the
specified options. Arguments may be specified in any order; options apply to all
named SCCS files. If a directory is named, get behaves as though each file in
the directory were specified as a named file, except that non-SCCS files (last
component of the path name does not begin with s.) and unreadable files are silently
ignored. If a name of - is given, the standard input is read; each line of the
standard input is taken to be the name of an SCCS file to be processed. Again,
non-SCCS files and unreadable files are silently ignored.

get [ -rSID ] [ -ccutoff ] [ -ilist ] [ -xlist ] [ -aseq-no. ] [ -k ] [ -e ]
    [ -l[p] ] [ -p ] [ -m ] [ -n ] [ -s ] [ -b ] [ -g ] [ -t ] filename ...

The generated text is normally written into a file called the g-file (whose name is
derived from the SCCS file name by simply removing the leading s.; see also
FILES, below).
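
The naming rules above (an SCCS file is one whose last path component begins with s., and the g-file name drops that prefix) can be sketched as follows. The function names are hypothetical; returning only the base name, without the directory of the SCCS file, is an assumption of this sketch.

```python
import os


def is_sccs_file(path):
    """True if the last component of the path name begins with 's.'."""
    return os.path.basename(path).startswith("s.")


def g_file_name(sccs_path):
    """Derive the g-file name by removing the leading 's.'."""
    name = os.path.basename(sccs_path)
    if not name.startswith("s."):
        raise ValueError(f"{sccs_path!r} is not an SCCS file name")
    return name[2:]


def select_sccs_files(names):
    """Keep only SCCS files, silently ignoring everything else."""
    return [name for name in names if is_sccs_file(name)]
```

For example, g_file_name("src/s.lang") gives "lang", and select_sccs_files(["s.lang", "README", "dir/s.abc"]) keeps the first and last entries.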

get Options

Options are explained below as though only one SCCS file is to be processed, but
the effects of any option argument apply independently to each named file.

ID String -r SID

The identification string (SID) of the version (delta) of an SCCS file to be
retrieved. Table A-1 below shows, for the most useful cases, what version of an
SCCS file is retrieved (as well as the ID of the version to be eventually created
by delta if the -e option is also used), as a function of the SID specified.

Cutoff -ccutoff

Cutoff date-time, in the form: YY[MM[DD[HH[MM[SS]]]]]

No changes (deltas) to the SCCS file which were created after the specified
cutoff date-time are included in the generated ASCII text file. Units omitted
from the date-time default to their maximum possible values; that is,
-c7502 is equivalent to -c750228235959. Any number of non-numeric
characters may separate the various 2 digit pieces of the date-time.
This feature allows one to specify a cutoff in the form: -c77/2/2
9:22:25. Note that this implies that one may use the %E% and %U%
identification keywords.
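
The defaulting rule for omitted units can be made concrete with a short Python sketch. It is not the parser get uses: the function name is hypothetical, and two-digit years are assumed to fall in the 1900s so that the last day of February can be computed.

```python
import calendar
import re


def expand_cutoff(cutoff, century=1900):
    """Expand a -c cutoff value to a full 12-digit YYMMDDHHMMSS string."""
    pieces = []
    for run in re.findall(r"\d+", cutoff):
        if len(run) <= 2:
            pieces.append(int(run))
        else:
            # An unseparated run of digits is read two digits at a time.
            pieces.extend(int(run[i:i + 2]) for i in range(0, len(run), 2))
    if not 1 <= len(pieces) <= 6:
        raise ValueError("expected between one and six date-time pieces")
    year = pieces[0]
    month = pieces[1] if len(pieces) > 1 else 12
    last_day = calendar.monthrange(century + year, month)[1]
    # Omitted units default to their maximum possible values.
    defaults = [year, month, last_day, 23, 59, 59]
    full = pieces + defaults[len(pieces):]
    return "".join(f"{piece:02d}" for piece in full)
```

With this sketch, expand_cutoff("7502") returns "750228235959", matching the equivalence given above, and expand_cutoff("77/2/2 9:22:25") returns "770202092225".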
Get for Editing -e

This get is for editing or making a change (delta) to the SCCS file via a
subsequent use of delta. A get -e applied to a particular version (ID) of
the SCCS file prevents further get -e commands on the same SID until
delta is run or the j (joint edit) flag is set in the SCCS file (see admin).
Concurrent use of get -e for different IDs is always allowed.

If the g-file generated by a get -e is accidentally ruined in the process
of editing it, it may be regenerated by re-running a get with the -k option
in place of the -e option.

SCCS file protection specified via the ceiling, floor, and authorized user list
stored in the SCCS file (see admin) is enforced when the -e option is
used.

New Branch -b

Used with the -e option to indicate that the new delta should have an SID in
a new branch as shown in Table A-1. This option is ignored if the b flag is not
present in the file (see admin) or if the retrieved delta is not a leaf delta.
A leaf delta is one that has no successors on the SCCS file tree.

NOTE A branch delta may always be created from a non-leaf delta.

Include List -i list

A list of deltas to be included (forced to be applied) in the creation of the
generated file. The list has the following syntax:

<list> ::= <range> | <list> , <range>

<range> ::= ID | ID - ID

ID, the SCCS Identification of a delta, may be in any form shown in the ‘SID
Specified’ column of Table A-1. Partial IDs are interpreted as shown in the ‘SID
Retrieved’ column of Table A-1.
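
The list syntax above is small enough to parse in a few lines. The sketch below is illustrative only: parse_delta_list is a hypothetical name, IDs are kept as strings, and no attempt is made to resolve partial IDs against Table A-1.

```python
def parse_delta_list(spec):
    """Parse '<range>,<range>,...' into (first_id, last_id) pairs."""
    ranges = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            raise ValueError("empty range in list")
        first, sep, last = item.partition("-")
        first, last = first.strip(), last.strip()
        # A single ID is treated as a range that starts and ends at that ID.
        ranges.append((first, last if sep else first))
    return ranges
```

For example, parse_delta_list("1.2,1.4-1.6") produces the pairs ("1.2", "1.2") and ("1.4", "1.6").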

Exclude List -x list

A list of deltas to be excluded (forced not to be applied) in the creation of
the generated file. See the -i option for the list format.

Don’t Expand ID Keywords -k

Do not replace identification keywords (see below) in the retrieved text by
their value. The -k option is implied by the -e option.

Write Delta Summary -l[p]

Write a delta summary into an l.file. If -lp is used, the delta summary is
written on the standard output and the l.file is not created. See FILES for
the format of the l.file.

Write Text to Standard Output -p

Write the text retrieved from the SCCS file to the standard output. No g-file
is created. All output which normally goes to the standard output goes to the
standard error file instead, unless the -s option is used, in which case it
disappears.

Suppress All Output -s

Suppress all output normally written on the standard output. However, fatal
error messages (which always go to the standard error file) remain
unaffected.

Show delta IDs -m

Precede each text line retrieved from the SCCS file with the ID of the delta
that inserted the text line in the SCCS file. The format is: ID, followed by a
horizontal tab, followed by the text line.

Show Module Names -n

Precede each generated text line with the %M% identification keyword value
(see below). The format is: %M% value, followed by a horizontal tab,
followed by the text line. When both the -m and -n options are used, the
format is: %M% value, followed by a horizontal tab, followed by the -m option
generated format.

Don’t Retrieve Text -g

Do not actually retrieve text from the SCCS file. It is primarily used to
generate an l.file, or to verify the existence of a particular ID.

Access Top Delta -t

Access the most recently created (‘top’) delta in a given release (for
example, -r1), or release and level (for example, -r1.2).

Delta Sequence Number -a seq-no.

The delta sequence number of the SCCS file delta (version) to be retrieved
(see sccsfile(5)). This option is used by the comb command; it is not a
generally useful option, and users should not use it. If both the -r and -a
options are specified, the -a option is used. Care should be taken when
using the -a option in conjunction with the -e option, as the SID of the
delta to be created may not be what one expects. The -r option can be
used with the -a and -e options to control the naming of the SID of the
delta to be created.
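
The line formats produced by the -m and -n options described above can be sketched as a single Python function. The name annotate and its parameters are hypothetical; only the documented layout (value, horizontal tab, remainder) is modeled.

```python
def annotate(text_line, sid=None, module=None):
    """Prefix a text line the way the -m (sid) and -n (module) options do."""
    prefix = ""
    if module is not None:
        # -n: %M% value, then a horizontal tab.
        prefix += module + "\t"
    if sid is not None:
        # -m: ID of the inserting delta, then a horizontal tab.
        prefix += sid + "\t"
    return prefix + text_line
```

With both options in effect, annotate("COBOL", sid="1.1", module="lang") gives the module name, a tab, the ID, a tab, and then the text line.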

For each file processed, get responds (on the standard output) with the SID 
being accessed and with the number of lines retrieved from the SCCS file. 

If the -e option is used, the SID of the delta to be made appears after the SID 
accessed and before the number of lines generated. If there is more than one 
named file or if a directory or standard input is named, each file name is printed 
(preceded by a new-line) before it is processed. If the -i option is used, included
deltas are listed following the notation ‘Included’; if the -x option is used,
excluded deltas are listed following the notation ‘Excluded’.

Table A-1 Determination of SCCS Identification String


SID*        -b Option   Other                 SID          SID of Delta
Specified   Used†       Conditions            Retrieved    to be Created

none‡       no          R defaults to mR      mR.mL        mR.(mL+1)
none‡       yes         R defaults to mR      mR.mL        mR.mL.(mB+1).1
R           no          R > mR                mR.mL        R.1***
R           no          R = mR                mR.mL        mR.(mL+1)
R           yes         R > mR                mR.mL        mR.mL.(mB+1).1
R           yes         R = mR                mR.mL        mR.mL.(mB+1).1
R           -           R < mR and            hR.mL**      hR.mL.(mB+1).1
                        R does not exist
R           -           Trunk succ.#          R.mL         R.mL.(mB+1).1
                        in release > R
                        and R exists
R.L         no          No trunk succ.        R.L          R.(L+1)
R.L         yes         No trunk succ.        R.L          R.L.(mB+1).1
R.L         -           Trunk succ.           R.L          R.L.(mB+1).1
                        in release > R
R.L.B       no          No branch succ.       R.L.B.mS     R.L.B.(mS+1)
R.L.B       yes         No branch succ.       R.L.B.mS     R.L.(mB+1).1
R.L.B.S     no          No branch succ.       R.L.B.S      R.L.B.(S+1)
R.L.B.S     yes         No branch succ.       R.L.B.S      R.L.(mB+1).1
R.L.B.S     -           Branch succ.          R.L.B.S      R.L.(mB+1).1

*   ‘R’, ‘L’, ‘B’, and ‘S’ are the ‘release’, ‘level’, ‘branch’, and ‘sequence’
    components of the SID, respectively; ‘m’ means ‘maximum’. Thus, for example,
    ‘R.mL’ means ‘the maximum level number within release R’;
    ‘R.L.(mB+1).1’ means ‘the first sequence number on the new branch (that is,
    maximum branch number plus one) of level L within release R’. Note that if
    the SID specified is of the form ‘R.L’, ‘R.L.B’, or ‘R.L.B.S’, each of the
    specified components must exist.

**  ‘hR’ is the highest existing release that is lower than the specified,
    nonexistent, release R.

*** Forces creation of the first delta in a new release.

#   Successor.

†   The -b option is effective only if the b flag (see admin) is present in the
    file. An entry of - means ‘irrelevant’.

‡   This case applies if the d (default SID) flag is not present in the file. If the
    d flag is present in the file, the SID obtained from the d flag is interpreted as
    if it had been specified on the command line. Thus, one of the other cases in
    this table applies.


Identification Keywords


When you generate a g-file to be used for compilation, it is useful and
informative to record the date and time of creation, the version retrieved, the
module’s name, etc., within the g-file, so that this information appears in a load
module when one is eventually created. SCCS provides a convenient mechanism for
doing this automatically. Identification (ID) keywords appearing anywhere in the
generated file are replaced by appropriate values according to the definitions of
these ID keywords.

The format of an ID keyword is an upper-case letter enclosed by percent signs
(%). For example, %I% is an ID keyword that is replaced by the SID of the
retrieved version of a file. Similarly, %H% is an ID keyword for the current date
(in the form ‘mm/dd/yy’), and %M% is the name of the g-file.

Thus, using get on an SCCS file that contains the C declaration:

    char identification[] = "%M% %I% %H%";

gives (for example) the following:

    char identification[] = "modulename 2.3 03/17/83";

If there are no ID keywords in the text, get might display:

    No id keywords (cm7)

This message is normally treated as a warning by get. However, if an i flag is
present in the SCCS file, it is treated as an error; see section A.8 for further
information.
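The substitution described above can be pictured with a small model. The following Python sketch is not how get is implemented; it only mimics the visible effect for three keywords. The function name expand_keywords and the values in VALUES are assumptions made for illustration.

```python
import re

# Hypothetical values for one retrieved version; a real get derives these
# from the SCCS file and the system clock.
VALUES = {
    "M": "modulename",  # %M%: module name
    "I": "2.3",         # %I%: SID of the retrieved text
    "H": "03/17/83",    # %H%: current date, MM/DD/YY
}


def expand_keywords(line, values):
    """Replace each %X% whose letter X is a key of `values`; leave others as-is."""
    def substitute(match):
        letter = match.group(1)
        return values.get(letter, match.group(0))

    return re.sub(r"%([A-Z])%", substitute, line)


declaration = 'char identification[] = "%M% %I% %H%";'
expanded = expand_keywords(declaration, VALUES)
print(expanded)
```

A keyword that has no entry in the mapping is passed through unchanged, which keeps the sketch from guessing at values it does not model.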


Table A-2  Identification Keywords

Keyword   Value

%M%       Module name: either the value of the m flag in the file (see admin),
          or if absent, the name of the SCCS file with the leading s. removed.
%I%       SCCS identification (SID) (%R%.%L%.%B%.%S%) of the retrieved text.
%R%       Release.
%L%       Level.
%B%       Branch.
%S%       Sequence.
%D%       Current date (YY/MM/DD).
%H%       Current date (MM/DD/YY).
%T%       Current time (HH:MM:SS).
%E%       Date newest applied delta was created (YY/MM/DD).
%G%       Date newest applied delta was created (MM/DD/YY).
%U%       Time newest applied delta was created (HH:MM:SS).
%Y%       Module type: value of the t flag in the SCCS file (see admin).
%F%       SCCS file name.
%P%       Fully qualified SCCS file name.
%Q%       The value of the q flag in the file (see admin).
%C%       Current line number. This keyword is intended for identifying
          messages output by the program such as ‘this shouldn’t have happened’
          type errors. It is not intended to be used on every line to provide
          sequence numbers.
%Z%       The 4-character string @(#) recognizable by what.
%W%       A shorthand notation for constructing what strings for program
          files. %W% = %Z%%M%<tab>%I%
%A%       Another shorthand notation for constructing what strings for
          nonstandard program files. %A% = %Z%%Y% %M% %I%%Z%


Retrieving Different Versions

You can retrieve versions other than the default version of an SCCS file by using
various options. Normally, the default version is the most recent delta of the
highest-numbered release on the trunk of the SCCS file tree. However, if the
SCCS file being processed has a d (default SID) flag, the SID specified as the
value of this flag is used as a default. The default SID is interpreted in exactly
the same way as the value supplied with the -r option of get.


The -r option specifies an SID to be retrieved, in which case the d (default SID) 
flag (if any) is ignored. For example, to retrieve version 1.3 of file ‘s.abc’, type: 



    hermes% get -r1.3 s.abc
    1.3
    64 lines
    hermes%


A branch delta may be retrieved in the same way: 



    hermes% get -r1.5.2.3 s.abc


When a two- or four-component SID is specified as a value for the -r option (as 
above) and the particular version does not exist in the SCCS file, an error message 
results. 


If you omit the level number of the SID, get retrieves the trunk delta with the 
highest level number within the given release, if the given release exists: 


    hermes% get -r3 s.abc
    3.7
    213 lines
    hermes%


get retrieved delta 3.7, the highest-level trunk delta in release 3. If the given
release does not exist, get falls back to the highest existing release below it,
and retrieves the trunk delta with the highest level number in that release. For
example, if release 9 does not exist in file ‘s.abc’, and release 7 is actually the
highest-numbered release below 9, then get would generate:


    hermes% get -r9 s.abc
    7.6
    420 lines
    hermes%


indicating that trunk delta 7.6 is the latest version of file ‘s.abc’ below release 9. 


Similarly, if you omit the sequence number of an SID, as in: 


    hermes% get -r4.3.2 s.abc
    4.3.2.8
    89 lines
    hermes%



get retrieves the branch delta with the highest sequence number on the given 
branch, if it exists. If the given branch does not exist, an error message results. 
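The rules above for one-, two-, three-, and four-component values of -r can be summarized in a short model. This Python sketch is illustrative only: resolve_requested_sid is a hypothetical name, the list of existing deltas is invented, and the error raised when no release at or below the requested one exists is an assumption (the text does not cover that case).

```python
def parse_sid(text):
    """Turn '4.3.2.8' into the tuple (4, 3, 2, 8)."""
    return tuple(int(part) for part in text.split("."))


def format_sid(sid):
    return ".".join(str(number) for number in sid)


def resolve_requested_sid(requested, existing):
    """Return the SID that would be retrieved for a given -r value."""
    wanted = parse_sid(requested)
    deltas = [parse_sid(sid) for sid in existing]

    if len(wanted) in (2, 4):
        # A complete trunk or branch SID must exist exactly as given.
        if wanted not in deltas:
            raise LookupError("nonexistent sid: " + requested)
        return requested

    if len(wanted) == 1:
        # Release only: highest trunk delta in that release, or in the
        # highest existing release below it.
        trunk = [d for d in deltas if len(d) == 2 and d[0] <= wanted[0]]
        if not trunk:
            raise LookupError("no release at or below " + requested)
        return format_sid(max(trunk))

    if len(wanted) == 3:
        # Release.level.branch: highest sequence number on that branch.
        branch = [d for d in deltas if len(d) == 4 and d[:3] == wanted]
        if not branch:
            raise LookupError("nonexistent branch: " + requested)
        return format_sid(max(branch))

    raise ValueError("a SID has one to four components")


# Hypothetical contents of s.abc, chosen to match the examples in the text.
EXISTING = ["3.6", "3.7", "4.3", "4.3.2.7", "4.3.2.8", "7.5", "7.6"]
print(resolve_requested_sid("3", EXISTING))
print(resolve_requested_sid("9", EXISTING))
print(resolve_requested_sid("4.3.2", EXISTING))
```

The -t option is deliberately left out: it selects by creation time rather than by position in the tree, so it would need delta dates that this sketch does not carry.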


The -t option retrieves the latest (‘top’) version in a particular release (that is, 
when no -r option is supplied, or when its value is simply a release number). 
The latest version is defined as that delta which was produced most recently, 
independent of its location on the SCCS file tree. Thus, if the most recent delta in 
release 3 is trunk delta 3.5, doing a get -t on release 3 produces: 


    hermes% get -r3 -t s.abc
    3.5
    59 lines
    hermes%


However, if branch delta 3.2.1.5 were the latest delta (created after delta 3.5), the 
same command produces: 


    hermes% get -r3 -t s.abc
    3.2.1.5
    46 lines
    hermes%





Retrieving to Make Changes

Specifying the -e option to the get command indicates the intent to make a
delta sometime later, and, as such, its use is restricted. If the -e option is
present, get checks the following things:

1. The user list, the list of login names and/or group IDs of users allowed 
to make deltas, to determine if the login name or group ID of the user 
executing get is on that list. Note that a null (empty) user list behaves 
as if it contained all possible login names. 

2. That the release (R) of the version being retrieved satisfies the relation: 

floor ≤ R ≤ ceiling

to determine if the release being accessed is a protected release. The 
floor and ceiling are specified as flags in the SCCS file. 

3. That the release (R) is not locked against editing. The lock is specified 
as a flag in the SCCS file. 

4. Whether or not multiple concurrent edits are allowed for the SCCS file 
as specified by the j flag in the SCCS file. Multiple concurrent edits are 
described in the section entitled Concurrent Edits of the Same SID.

get terminates processing of the corresponding SCCS file if any of the first three 
conditions fails. 
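The first three checks can be sketched as a single function. This is a model for illustration, not the logic of get itself: the name check_edit_allowed, the representation of the user list as a Python list of login names and numeric group IDs, and the treatment of floor and ceiling as inclusive bounds are all assumptions.

```python
def check_edit_allowed(login, group_id, release, user_list,
                       floor, ceiling, locked_releases):
    """Return the list of failed checks; an empty list means get -e may proceed."""
    failures = []

    # Check 1: an empty user list behaves as if it named every user.
    if user_list and login not in user_list and group_id not in user_list:
        failures.append("user not on user list")

    # Check 2: the release must lie within the protected range
    # (bounds assumed inclusive).
    if not (floor <= release <= ceiling):
        failures.append("release outside floor/ceiling")

    # Check 3: the release must not be locked against editing.
    if release in locked_releases:
        failures.append("release locked")

    return failures


print(check_edit_allowed("alice", 20, 3, ["alice", 30], 1, 5, {4}))
print(check_edit_allowed("bob", 20, 4, [], 1, 3, {4}))
```

The fourth check (the j flag) is omitted because it depends on the p.file entries already present rather than on flags alone.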

If the above checks succeed, get with the -e option creates a g-file in the 
current directory with mode 644 (readable by everyone, writable only by the 
owner) owned by the real user. 

get terminates with an error if a writable g-file already exists — this is to 
prevent inadvertent destruction of a g-file that already exists and is being edited 
for the purpose of making a delta. 

ID keywords appearing in the g-file are not substituted by get when the -e 
option is specified, because the generated g-file is to be subsequently used to 
create another delta, and replacement of ID keywords would permanently change 
them within the SCCS file. In view of this, get does not check for the presence
of ID keywords within the g-file, so that the message ‘No id keywords (cm7)’ is
never displayed when get is invoked with the -e option.


In addition, a get with the -e option creates (or updates) a p.file, for passing
information to the delta command. Let’s look at an example of get -e:

    hermes% get -e s.abc
    1.3
    new delta 1.4
    67 lines
    hermes%

The message indicates that get has retrieved version 1.3, which has 67 lines; the
version delta will create is version 1.4.


If the -r and/or -t options are used together with the -e option, the version 
retrieved for editing is as specified by the -r and/or -t options. 

The options -i and -x may be used to specify a list of deltas to be included 
and excluded, respectively, by get. See get for the syntax of such a list. 
‘Including a delta’ forces the changes that constitute the particular delta to be
included in the retrieved version; this is useful for applying the same changes
to more than one version of the SCCS file. ‘Excluding a delta’ forces it not to be
applied. This is useful for undoing the effects of a previous delta in the version
of the SCCS file to be created.

Whenever deltas are included or excluded, get checks for possible interference
between such deltas and those deltas that are normally used in retrieving the
particular version of the SCCS file. Two deltas can interfere, for example, when
each one changes the same line of the retrieved g-file. Any interference is
indicated by a warning that displays the range of lines within the retrieved g-file
in which the problem may exist. The user is expected to examine the g-file to
determine whether a problem actually exists, and to take whatever corrective
measures are deemed necessary.

NOTE The -i and -x options should be used with extreme care. 

The -k option to get can be used to regenerate a g-file that may have been
accidentally removed or ruined after executing get with the -e option, or to
simply generate a g-file in which the replacement of ID keywords has been
suppressed. Thus, a g-file generated by the -k option is identical to one
produced by get executed with the -e option. However, no processing related to
the p.file takes place.

Concurrent Edits of Different SIDs

The ability to retrieve different versions of an SCCS file allows a number of
deltas to be ‘in progress’ at any given time. In general, several people may
simultaneously edit the same SCCS file provided they are editing different
versions of that file. This is the situation we discuss in this section. However,
there is a provision for multiple concurrent edits, so that more than one person
can edit the same version; see the section entitled Concurrent Edits of the Same
SID.

The p.file, created via a get -e command, is named by replacing the ‘s.’
in the SCCS file name with ‘p.’. The p.file is created in the directory containing
the SCCS file, is given mode 644 (readable by everyone, writable only by the
owner), and is owned by the effective user. The p.file contains the following
information for each delta that is still ‘in progress’:*

□ The SID of the retrieved version.

□ The SID that will be given to the new delta when it is created.

□ The login name of the real user executing get.

* Other information may be present, but is not of concern here. See get for
further discussion.

The first execution of get -e creates the p.file for the corresponding SCCS
file. Subsequent executions only update the p.file by inserting a line containing
the above information. Before inserting this line, however, get performs two
checks. First, it searches the entries in the p.file for an SID which matches that
of the requested version, to make sure that the requested version has not already
been retrieved. Secondly, get determines whether or not multiple concurrent
edits are allowed. If the requested version has been retrieved and multiple
concurrent edits are not allowed, an error message results. Otherwise, the user is
informed that other deltas are in progress, and processing continues.

It is important to note that the various executions of get should be carried out
from different directories. Otherwise, only the first use of get will succeed;
since subsequent gets would attempt to overwrite a writable g-file, they
produce an SCCS error condition. In practice, this problem does not arise: normally
such multiple executions are performed by different users** from different
working directories.

Table A-1 shows, for the most useful cases, what version of an SCCS file is
retrieved by get, as well as the SID of the version to be eventually created by
delta, as a function of the SID specified to get.
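For readers who prefer code to tables, the following Python sketch models a subset of Table A-1: the rows in which a complete SID (R.L or R.L.B.S) has been retrieved. It does not cover the rows that start a new release or fall back to a lower release. The function name new_delta_sid and its parameters are assumptions; in particular, max_branch stands for the highest branch number already present at level R.L (0 if there is none), and b_option means that -b was given and is effective.

```python
def new_delta_sid(retrieved, has_successor, b_option, max_branch):
    """SID that delta would create after get -e retrieved `retrieved`.

    has_successor: True if the retrieved delta already has a successor on
                   its own line (the trunk for R.L, its branch for R.L.B.S).
    """
    parts = [int(part) for part in retrieved.split(".")]
    if len(parts) not in (2, 4):
        raise ValueError("expected a two- or four-component SID")

    release, level = parts[0], parts[1]
    if has_successor or b_option:
        # A new branch is started at level R.L: R.L.(mB+1).1
        return "%d.%d.%d.1" % (release, level, max_branch + 1)

    # Otherwise the delta extends the same line: R.(L+1) or R.L.B.(S+1)
    parts[-1] += 1
    return ".".join(str(part) for part in parts)


print(new_delta_sid("1.3", False, False, 0))      # extends the trunk
print(new_delta_sid("1.3", True, False, 0))       # trunk successor exists
print(new_delta_sid("1.3.2.2", False, True, 2))   # -b forces a new branch
```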

Concurrent Edits of the Same SID

Normally, gets for editing (-e option specified) cannot operate concurrently
on the same SID. Usually delta must be used before another get -e on the
same SID. However, multiple concurrent edits (two or more successive
get -e commands based on the same retrieved SID) are allowed if the j flag
is set in the SCCS file. Thus:



    hermes% get -e s.abc
    1.1
    new delta 1.2
    5 lines
    hermes%



may be immediately followed by: 


    hermes% get -e s.abc
    1.1
    new delta 1.1.1.1
    5 lines
    hermes%





without an intervening use of delta. In this case, a delta command 
corresponding to the first get produces delta 1.2 (assuming 1.1 is the latest 
(most recent) trunk delta), and the delta command corresponding to the 
second get produces delta 1.1.1.1. 

Options that Affect Output

When the -p option is specified, get writes the retrieved text to the standard
output, rather than to a g-file. In addition, all output normally directed to the
standard output (such as the SID of the version retrieved and the number of lines
retrieved) is directed instead to the diagnostic output. This may be used, for
example, to create g-files with arbitrary names:

    hermes% get -p s.abc > arbitrary-filename

** See the section entitled Protection for a discussion of how different users can
use SCCS commands on the same files.
The -s option suppresses all output that is normally directed to the standard
output. Thus, the SID of the retrieved version, the number of lines retrieved, and
so on, do not appear on the standard output. -s does not affect messages directed
to the diagnostic output. -s is often used in conjunction with the -p option to
‘pipe’ the output of get, as in:

    hermes% get -p -s s.abc | nroff


A get -g verifies the existence of a particular SID in an SCCS file but does not
actually retrieve the text. This may be useful in a number of ways. For example:

    hermes% get -g -r4.3 s.abc

displays the specified SID if it exists in the SCCS file, and generates an error
message if it doesn’t. -g can also be used to regenerate a p.file that has been
destroyed:

    hermes% get -e -g s.abc

get used with the -l option creates an l.file, which is named by replacing the
‘s.’ of the SCCS file name with ‘l.’. This file is created in the current directory,
with mode 444 (read-only), and is owned by the real user. It contains a table
(format described in get) showing which deltas were used in constructing a
particular version of the SCCS file. For example:

    hermes% get -r2.3 -l s.abc

generates an l.file showing which deltas were applied to retrieve version 2.3 of
the SCCS file. Specifying a value of ‘p’ with the -l option, as in:

    hermes% get -lp -r2.3 s.abc

sends the generated output to the standard output rather than to the l.file.
Note that the -g option may be used with the -l option to suppress the actual
text retrieval.

The -m option identifies the origin of each change applied to an SCCS file: it
tags each line of the generated g-file with the SID of the delta it came from. The
SID precedes the line, and is separated from the text by a tab character.

When the -n option is specified, each line of the generated g-file is preceded by
the value of the %M% ID keyword and a tab character. The -n option is most
often used in a pipeline with grep(1).

For example, to find all lines that match a given pattern in the latest version of
each SCCS file in a directory:

    hermes% get -p -n -s directory | grep pattern

If both the -m and -n options are specified, each line of the generated g-file is
preceded by the value of the %M% ID keyword and a tab (the effect of the -n
option), followed by the line in the format produced by the -m option.

Since using the -m option, the -n option, or both, modifies the contents of the
g-file, such a g-file must not be used for creating a delta. Therefore, neither the
-m nor the -n option may be used with the -e option.

Files Used

Several auxiliary files may be created by get. These files are known generically
as the g-file, l.file, p.file, and z.file. The letter before the ‘dot’ is called the
tag. The current version, or g-file, has no tag. An auxiliary file name is based
on the format of the SCCS-file name: the last component of the SCCS-file name is
of the form s.version-name, and the auxiliary files are named by replacing the
leading s. with the tag. The g-file is an exception to this scheme: its name is
derived by removing the s. prefix. For example, for s.xyz.c, the auxiliary
file names would be xyz.c (g-file), l.xyz.c, p.xyz.c, and z.xyz.c.

g-file

The g-file, which contains the generated text, is created in the current directory
(unless the -p option is used). A g-file is created in all cases, whether or not any
lines of text were generated by the get. It is owned by the real user. If the -k
option is used or implied its mode is 644; otherwise its mode is 444. Only the
real user need have write permission in the current directory.

l-file

The l.file contains a table showing which deltas were applied in generating the
retrieved text. The l.file is created in the current directory if the -l option is
used; its mode is 444 and it is owned by the real user. Only the real user need
have write permission in the current directory.

Format of Lines in the l-file

Lines in the l.file have the following format:

a. A blank character if the delta was applied; * otherwise.

b. A blank character if the delta was applied or wasn’t applied and ignored;
   * if the delta wasn’t applied and wasn’t ignored.

c. A code indicating a ‘special’ reason why the delta was or was not applied:

   ‘I’: Included.
   ‘X’: Excluded.
   ‘C’: Cut off (by a -c option).

d. Blank.

e. SCCS identification (SID).

f. Tab character.

g. Date and time (in the form YY/MM/DD HH:MM:SS) of creation.

h. Blank.

i. Login name of person who created delta.

The comments and MR data follow on subsequent lines, indented one horizontal
tab character. A blank line terminates each entry.
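As an illustration of the layout just described, the Python sketch below splits the first line of an l.file entry into its fields. The name parse_lfile_line, the sample lines, and the assumption that field c is a blank when no special reason applies are all hypothetical; comment and MR lines are not handled.

```python
from collections import namedtuple

LFileEntry = namedtuple("LFileEntry",
                        "applied ignored code sid date time login")


def parse_lfile_line(line):
    """Split the first line of an l.file entry into fields a through i."""
    applied = line[0] == " "                         # field a
    # Field b is blank for an applied delta and for one that was not
    # applied and ignored; '*' means not applied and not ignored.
    ignored = (not applied) and line[1] == " "
    code = line[2] if line[2] in "IXC" else None     # field c
    # Field d (index 3) is a blank; the SID runs up to the tab (fields e, f).
    sid, rest = line[4:].rstrip("\n").split("\t", 1)
    date, time, login = rest.split(" ", 2)           # fields g, h, i
    return LFileEntry(applied, ignored, code, sid, date, time, login)


# Made-up sample lines in the documented layout.
first = parse_lfile_line("    1.3\t88/05/09 12:30:00 alice")
second = parse_lfile_line("**X 1.2\t88/05/01 09:00:00 bob")
print(first)
print(second)
```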

p-file

The p.file passes information resulting from a get -e along to delta. Its
contents are also used to prevent a subsequent execution of a get -e for the
same SID until delta is executed or the joint edit flag, j, (see admin) is set in
the SCCS file. The p.file is created in the directory containing the SCCS file and
the effective user must have write permission in that directory. Its mode is 644
and it is owned by the effective user. The format of the p.file is: the gotten SID,
followed by a blank, followed by the SID that the new delta will have when it is
made, followed by a blank, followed by the login name of the real user, followed
by a blank, followed by the date-time the get was executed, followed by a blank
and the -i option if it was present, followed by a blank and the -x option if it
was present, followed by a new-line. There can be an arbitrary number of lines
in the p.file at any time; no two lines can have the same new delta SID.

z-file

The z-file serves as a lock-out mechanism against simultaneous updates. Its
contents are the binary (2 bytes) process ID of the command (that is, get) that
created it. The z-file is created in the directory containing the SCCS file for the
duration of get. The same protection restrictions as those for the p.file apply for
the z-file. The z-file is created mode 444.

Limitations of the get Command

If the effective user has write permission (either explicitly or implicitly) in the
directory containing the SCCS files, but the real user doesn’t, only one file may
be named when the -e option is used.

A.10. help — Ask for SCCS Help

help finds information to explain a message from a command or explain the use
of a command. Zero or more arguments may be supplied. If no arguments are
given, help prompts for one.

    help [args]

The arguments may be either message numbers (which normally appear in
parentheses following messages) or command names, of one of the following
types:

type 1   Begins with non-numerics, ends in numerics. The non-numeric prefix
         is usually an abbreviation for the program or set of routines which
         produced the message (for example, ge6, for message 6 from the get
         command).

type 2   Does not contain numerics (as a command, such as get).

type 3   Is all numeric (for example, 212).

The response of the program is the explanatory information related to the
argument, if there is any.
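The three argument types can be told apart mechanically. The following Python sketch shows one way to do so; classify_help_argument is a hypothetical name, and the function models only the classification described above, not how help looks up its message text.

```python
import re


def classify_help_argument(argument):
    """Return 1, 2, or 3 for the argument types above, or None if none fits."""
    if re.fullmatch(r"[0-9]+", argument):
        return 3  # all numeric, for example 212
    if re.fullmatch(r"[^0-9]+[0-9]+", argument):
        return 1  # non-numeric prefix followed by numerics, for example ge6
    if re.fullmatch(r"[^0-9]+", argument):
        return 2  # no numerics at all, for example a command name
    return None


for argument in ("ge6", "get", "212"):
    print(argument, classify_help_argument(argument))
```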

When all else fails, try help stuck.

Example of help

The following asks for help on the ge5 error message and information about the
rmdel command:

    hermes% help ge5 rmdel
    "nonexistent sid"
    The specified sid does not exist in the
    given file.
    Check for typos.

    rmdel:
            rmdel -rSID name

    hermes%

Files Used

/usr/lib/help    directory containing files of message text.

A.11. prs — Print SCCS File

prs prints, on the standard output, parts or all of an SCCS file (see
sccsfile(5)) in a user supplied format. If a directory is named, prs behaves
as though each file in the directory were specified as a named file, except that
non-SCCS files (last component of the path name does not begin with s.), and
unreadable files are silently ignored. If a name of - is given, the standard input is
read, in which case each line is taken to be the name of an SCCS file or directory
to be processed; non-SCCS files and unreadable files are silently ignored.

    prs [ -d[ dataspec ] ] [ -r[ SID ] ] [ -e ] [ -l ] [ -a ] filename ...

prs Options

Options apply independently to each named file.


Output data specification

-d[dataspec]
Specifies the output data specification. The dataspec is a string consisting of
SCCS file data keywords (see A.11.2) interspersed with optional user supplied
text.

ID string

-r[SID]
Specifies the SCCS IDentification (SID) string of a delta for which information
is desired. If no SID is specified, the SID of the most recently created
delta is assumed.

Information on earlier deltas

-e Requests information for all deltas created earlier than and including the
delta designated via the -r option.

Information on later deltas

-l Requests information for all deltas created later than and including the delta
designated via the -r option.

Information for all deltas

-a Requests printing of information for both removed, that is, delta type = R,
(see rmdel) and existing, that is, delta type = D, deltas. If the -a option is
not specified, information for existing deltas only is provided.
In the absence of the -d option, prs displays a default set of information
consisting of: delta-type, release number and level number, date and time last
changed, user-name of the person who changed the file, lines inserted, changed,
and unchanged, the MR numbers, and the comments.

Data Keywords

Data keywords specify which parts of an SCCS file are to be retrieved and output.
All parts of an SCCS file (see sccsfile(5)) have an associated data keyword.
There is no limit on the number of times a data keyword may appear in a
dataspec.

The information printed by prs consists of: 1) the user supplied text; and 2)
appropriate values (extracted from the SCCS file) substituted for the recognized
data keywords in the order of appearance in the dataspec. The format of a data
keyword value is either Simple (S), in which keyword substitution is direct, or
Multi-line (M), in which keyword substitution is followed by a carriage return.

User supplied text is any text other than recognized data keywords. A tab is
specified by \t and carriage return/new-line is specified by \n.
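To make the substitution rule concrete, here is a minimal Python sketch of how a dataspec might be expanded. It is a model only: expand_dataspec is a hypothetical name, the dataspec string and the values in SAMPLE are invented for illustration, and the trailing carriage return of Multi-line (M) keywords is not modelled.

```python
import re


def expand_dataspec(dataspec, values):
    """Substitute :X: and :XX: data keywords and translate \\t and \\n."""
    # Translate the two escapes the text mentions: \t (tab) and \n (new-line).
    text = dataspec.replace("\\t", "\t").replace("\\n", "\n")

    def substitute(match):
        keyword = match.group(1)
        # Anything that is not a recognized keyword stays as user supplied text.
        return values.get(keyword, match.group(0))

    return re.sub(r":([A-Za-z]{1,2}):", substitute, text)


# Hypothetical values for the newest delta of a file whose module is main.c.
SAMPLE = {"M": "main.c", "I": "3.7", "D": "77/12/1", "P": "cas"}
line = expand_dataspec(r"Newest delta for pgm :M:: :I: Created :D: By :P:", SAMPLE)
print(line)
```

Note that a literal colon may follow a keyword directly, as in `:M::`; the keyword is consumed first and the remaining colon is treated as ordinary text.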

Table A-3  SCCS Files Data Keywords

Keyword  Data Item                       File Section  Value              Format

:Dt:     Delta information               Delta Table   See below*         S
:DL:     Delta line statistics           Delta Table   :Li:/:Ld:/:Lu:     S
:Li:     Lines inserted by Delta         Delta Table   nnnnn              S
:Ld:     Lines deleted by Delta          Delta Table   nnnnn              S
:Lu:     Lines unchanged by Delta        Delta Table   nnnnn              S
:DT:     Delta type                      Delta Table   D or R             S
:I:      SCCS ID string (SID)            Delta Table   :R:.:L:.:B:.:S:    S
:R:      Release number                  Delta Table   nnnn               S
:L:      Level number                    Delta Table   nnnn               S
:B:      Branch number                   Delta Table   nnnn               S
:S:      Sequence number                 Delta Table   nnnn               S
:D:      Date Delta created              Delta Table   :Dy:/:Dm:/:Dd:     S
:Dy:     Year Delta created              Delta Table   nn                 S
:Dm:     Month Delta created             Delta Table   nn                 S
:Dd:     Day Delta created               Delta Table   nn                 S
:T:      Time Delta created              Delta Table   :Th:::Tm:::Ts:     S
:Th:     Hour Delta created              Delta Table   nn                 S
:Tm:     Minutes Delta created           Delta Table   nn                 S
:Ts:     Seconds Delta created           Delta Table   nn                 S
:P:      Programmer who created Delta    Delta Table   logname            S
:DS:     Delta sequence number           Delta Table   nnnn               S
:DP:     Predecessor Delta seq-no.       Delta Table   nnnn               S
:DI:     Seq-no. of deltas incl.,        Delta Table   :Dn:/:Dx:/:Dg:     S
         excl., ignored
:Dn:     Deltas included (seq #)         Delta Table   :DS: :DS:...       S
:Dx:     Deltas excluded (seq #)         Delta Table   :DS: :DS:...       S
:Dg:     Deltas ignored (seq #)          Delta Table   :DS: :DS:...       S
:MR:     MR numbers for delta            Delta Table   text               M
:C:      Comments for delta              Delta Table   text               M
:UN:     User names                      User Names    text               M
:FL:     Flag list                       Flags         text               M
:Y:      Module type flag                Flags         text               S
:MF:     MR validation flag              Flags         yes or no          S
:MP:     MR validation pgm name          Flags         text               S
:KF:     Keyword error/warning flag      Flags         yes or no          S
:BF:     Branch flag                     Flags         yes or no          S
:J:      Joint edit flag                 Flags         yes or no          S
:LK:     Locked releases                 Flags         :R:...             S
:Q:      User defined keyword            Flags         text               S
:M:      Module name                     Flags         text               S
:FB:     Floor boundary                  Flags         :R:                S
:CB:     Ceiling boundary                Flags         :R:                S
:Ds:     Default SID                     Flags         :I:                S
:ND:     Null delta flag                 Flags         yes or no          S
:FD:     File descriptive text           Comments      text               M
:BD:     Body                            Body          text               M
:GB:     Gotten body                     Body          text               M
:W:      A form of what(1) string        N/A           :Z::M:\t:I:        S
:A:      A form of what(1) string        N/A           :Z::Y: :M: :I::Z:  S
:Z:      what(1) string delimiter        N/A           @(#)               S
:F:      SCCS file name                  N/A           text               S
:PN:     SCCS file path name             N/A           text               S

* :Dt: = :DT: :I: :D: :T: :P: :DS: :DP:


Examples of Using prs

    hermes% prs -d"Users and/or user IDs for :F: are:\n:UN:" s.file

may produce on the standard output:

    Users and/or user IDs for s.file are:
    xyz
    131
    abc

    hermes% prs -d"Newest delta for pgm :M:: :I: Created :D: By :P:" -r s.file

may produce on the standard output:

    Newest delta for pgm main.c: 3.7 Created 77/12/1 By cas

As a special case:

    hermes% prs s.file

may produce on the standard output:

    D 1.1 77/12/1 00:00:00 cas 1 000000/00000/00000
    MRs:
    bl78-12345
    bl79-54321
    COMMENTS:
    this is the comment line for s.file initial delta

for each delta table entry of the ‘D’ type. The only option argument allowed to
be used with the special case is the -a option.

Files Used

/tmp/pr?????

A.12. rmdel — Remove Delta from SCCS File


rmdel removes the delta specified by the SID from each named SCCS file. The
delta to be removed must be the newest (most recent) delta in its branch in the
delta chain of each named SCCS file. In addition, the SID specified must not be
that of a version being edited for the purpose of making a delta (that is, if a
p.file (see get) exists for the named SCCS file, the SID specified must not
appear in any entry of the p.file).

    rmdel -rSID filename ...

If a directory is named, rmdel behaves as though each file in the directory were
specified as a named file, except that non-SCCS files (last component of the path
name does not begin with s.) and unreadable files are silently ignored. If a
name of - is given, the standard input is read; each line of the standard input is
taken to be the name of an SCCS file to be processed; non-SCCS files and
unreadable files are silently ignored.

The exact permissions necessary to remove a delta are documented earlier in this
manual, in the chapter on SCCS. Simply stated, they are either 1) if you make
a delta you can remove it; or 2) if you own the file and directory you can remove
a delta.

The delta to be removed must be a ‘leaf’ delta; that is, it must be the latest (most
recently created) delta on its branch or on the trunk of the SCCS file tree. In
Figure A-3, only deltas 1.3.1.2, 1.3.2.2, and 2.2 can be removed; once they are
removed, deltas 1.3.2.1 and 2.1 can be removed, and so on.
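The ‘leaf’ rule can be sketched in a few lines of Python. This is a simplified model: is_leaf is a hypothetical name, the list of deltas is an assumed tree chosen to agree with the SIDs mentioned above (Figure A-3 itself is not reproduced here), and ‘latest’ is approximated by the highest level on the trunk or the highest sequence number on a branch.

```python
def parse_sid(text):
    return tuple(int(part) for part in text.split("."))


def is_leaf(sid, all_sids):
    """True if no later delta exists on the same trunk or branch."""
    target = parse_sid(sid)
    for other in (parse_sid(item) for item in all_sids):
        if len(other) != len(target):
            continue  # trunk and branch deltas are on different lines
        if len(target) == 2 and other > target:
            return False  # a later trunk delta exists
        if len(target) == 4 and other[:3] == target[:3] and other[3] > target[3]:
            return False  # a later delta exists on the same branch
    return True


# Assumed tree: a trunk from 1.1 to 2.2 and two branches at level 1.3.
TREE = ["1.1", "1.2", "1.3", "1.4", "2.1", "2.2",
        "1.3.1.1", "1.3.1.2", "1.3.2.1", "1.3.2.2"]
removable = [sid for sid in TREE if is_leaf(sid, TREE)]
print(removable)
```

Removing the leaves from the list and running the check again shows the next deltas that become removable, which mirrors the ‘and so on’ in the text.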

To remove a delta, the effective user must have write permission in the directory
containing the SCCS file. In addition, the real user must either have created the
delta being removed, or be the owner of the SCCS file and its directory.

You must specify the complete SID of the delta to be removed, preceded by -r.
The SID must have two components for a trunk delta, and four components for a
branch delta. Thus:

    hermes% rmdel -r2.3 s.abc

removes (trunk) delta ‘2.3’ of the SCCS file.
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Before removing the delta, rmdel checks the following things: 

1. the release number (R) of the given SID satisfies the relation: 


floor < R < ceiling 

2. the SID specified is not that of a version for which a get for editing has 
been executed and whose associated delta has not yet been made. 

3. the login name or group ID of the user either appears in the file’s user list or 
the user list is empty. 

4. the release specified is not locked against editing (that is, if the l flag is
set (see admin), the release specified must not be contained in the list).

If these conditions are satisfied, the delta is removed. Otherwise, processing is 
terminated. 
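The four checks can be summarized as a small Python function. This is a sketch under stated assumptions, not the rmdel implementation: the argument names are invented, the floor and ceiling bounds are assumed to be inclusive, and group IDs are assumed to be stored in the user list as strings.

```python
def rmdel_checks(sid, floor, ceiling, pending, user, group_ids,
                 user_list, locked):
    """Return the reasons a delta could not be removed (empty list if none).

    pending   -- SIDs for which a get for editing is outstanding
    user_list -- login names and group IDs allowed to change the file
    locked    -- release numbers locked against editing (the l flag)
    """
    release = int(sid.split(".")[0])
    failures = []
    if not (floor <= release <= ceiling):           # check 1 (inclusive: assumption)
        failures.append("release outside floor/ceiling")
    if sid in pending:                              # check 2
        failures.append("SID is out for editing")
    allowed = set(user_list)
    if allowed and user not in allowed and not (set(group_ids) & allowed):
        failures.append("user not in user list")    # check 3
    if release in locked:                           # check 4
        failures.append("release is locked")
    return failures


print(rmdel_checks("2.3", 1, 9999, set(), "pat", ["20"], [], set()))
```

An empty result corresponds to the case in which the delta is removed; any entry corresponds to processing being terminated.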

After the specified delta has been removed, its type indicator in the delta table of
the SCCS file is changed from ‘D’ (delta) to ‘R’ (removed).

Files Used 

x-file (see delta) 

z-file (see delta) 


A.13. sact — Display 
SCCS Editing
Activity 



sact informs the user of any SCCS files that have had one or more get -e
commands applied to them; that is, files that are out for editing and for which
deltas are pending. If a directory is named on the command line, sact behaves as
though each file in the directory were specified as a named file, except that
non-SCCS files and unreadable files are silently ignored. If a name of - is given,
the standard input is read, with each line being taken as the name of an SCCS
file to be processed.



The output for each named file consists of five fields separated by spaces. 


Field
Number   Meaning

1        specifies the SID of a delta that currently exists in the SCCS file to
         which changes will be made to make the new delta.

2        specifies the SID for the new delta to be created.

3        contains the logname of the user who will make the delta (that is,
         who executed a get for editing).

4        contains the date that get -e was executed.

5        contains the time that get -e was executed.
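A script that consumes sact output might split each line into the five fields as follows. This Python sketch is illustrative only: the type and function names are invented, and the sample line (including its date and time layout) is hypothetical.

```python
from collections import namedtuple

# One record per outstanding get -e, in the field order described above.
SactEntry = namedtuple("SactEntry", "old_sid new_sid user date time")


def parse_sact_line(line):
    """Split one line of sact output into its five space-separated fields."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError("expected five fields, got %d" % len(fields))
    return SactEntry(*fields)


# Hypothetical sample line; real output depends on the file being examined.
entry = parse_sact_line("1.3 1.4 pat 88/05/09 14:07:31")
print(entry.user, "is preparing delta", entry.new_sid)
```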


Sun Microsystems
Revision A of 9 May 1988

Programming Utilities and Libraries


A.14. sccsdiff — Display Differences in SCCS Versions

sccsdiff compares two versions of an SCCS file and generates the differences
between the two versions. Any number of SCCS files may be specified, but
options apply to all files.

sccsdiff -rSID1 -rSID2 [ -p ] [ -sn ] filename ...

sccsdiff Options

-rSID?
     SID1 and SID2 specify the deltas of an SCCS file that are to be compared.
     Versions are passed to diff in the order given.

-p   Pipe output for each file through pr.

-sn  n is the file segment size that diff(1) will use. This is useful when the
     system load is high.

Files Used

/tmp/get?????     Temporary files

Diagnostics from sccsdiff

file: No differences
     If the two versions are the same.

A.15. unget — Undo a Previous SCCS get

unget undoes the effect of a get -e done prior to creating the intended new
delta. If a directory is named, unget behaves as though each file in the
directory were specified as a named file, except that non-SCCS files and unreadable
files are silently ignored. If a name of - is given, the standard input is read with
each line being taken as the name of an SCCS file to be processed.

unget [ -rSID ] [ -s ] [ -n ] filename ...

unget Options

Options apply independently to each named file.

-rSID (delta to be removed)
     Uniquely identifies which delta is no longer intended. (This would have
     been specified by get as the “new delta”.) The -r option is necessary
     only if two or more outstanding gets for editing on the same SCCS file were
     done by the same person (login name). A diagnostic results if the specified
     SID is ambiguous, or if it is necessary but omitted from the command line.

-s   (suppress delta ID)
     Suppress displaying the intended delta’s SID.

-n   (retain gotten file)
     Retain the gotten file, which is normally removed from the current directory.
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A. 16. val — Validate 
SCCS File


val determines if the specified file is an SCCS file meeting the characteristics
specified by the optional argument list. Arguments to val may appear in any
order.

val -

or

val [ -s ] [ -rSID ] [ -mname ] [ -ytype ] filename ...

val has a special argument, -, which means read the standard input until an
end-of-file condition is detected. Each line read is independently processed as if
it were a command line argument list.

val generates diagnostic messages on the standard output for each command
line and file processed and also returns a single 8-bit code upon exit as described
below.

val Options

Options apply independently to each named file on the command line.

-s   (suppress error messages)
     Silence diagnostic messages normally generated for errors detected while
     processing the specified files.

-rSID (delta number)
     The argument value SID (SCCS IDentification String) is an SCCS delta
     number. A check is made to determine if the SID is ambiguous (for instance,
     -r1 is ambiguous because it physically does not exist but implies 1.1, 1.2,
     etc., which may exist) or invalid (for instance, -r1.0 or -r1.1.0 are invalid
     because neither case can exist as a valid delta number). If the SID is valid
     and not ambiguous, a check is made to determine if it actually exists.

-mname (compare module names)
     name is compared with the SCCS %M% keyword in file.

-ytype (compare module types)
     type is compared with the SCCS %Y% keyword in file.
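The distinction that the -r option draws between invalid and ambiguous SIDs can be sketched in Python. This is an illustration, not val's own logic: the function name is invented, and treating a three-component SID as ambiguous is an assumption made by analogy with the one-component case.

```python
def classify_sid(sid):
    """Return 'invalid', 'ambiguous', or 'valid' for an SID string."""
    parts = sid.split(".")
    if not 1 <= len(parts) <= 4 or not all(p.isdecimal() for p in parts):
        return "invalid"
    if any(int(p) == 0 for p in parts):
        return "invalid"      # 1.0 and 1.1.0 can never name a delta
    if len(parts) in (1, 3):
        # A lone release such as 1 implies 1.1, 1.2, and so on.
        # The three-component case is an assumption of this sketch.
        return "ambiguous"
    return "valid"            # two components (trunk) or four (branch)


for sid in ("1", "1.0", "1.1.0", "1.2", "1.3.1.2"):
    print(sid, classify_sid(sid))
```

A valid result says only that the SID is well formed; whether the delta actually exists must still be checked against the file.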

The 8-bit code returned by val is a disjunction of the possible errors; that is, it
can be interpreted as a bit string where (moving from left to right) set bits are
interpreted as follows:
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Table A-4 Codes Returned from val Command 


Bit    Meaning

0      missing file argument
1      unknown or duplicate option
2      corrupted SCCS file
3      can’t open file or file not SCCS
4      SID is invalid or ambiguous
5      SID does not exist
6      %Y%, -y mismatch
7      %M%, -m mismatch


Note that val can process two or more files on a given command line and in turn 
can process multiple command lines (when reading the standard input). In these 
cases an aggregate code is returned — logical OR of the codes generated for each 
command line and file processed. 
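A script that inspects the exit code might decode it as shown below. This Python sketch is illustrative and rests on one interpretation of the text: because bits are counted from left to right, bit 0 is taken to be the most significant bit (0x80) and bit 7 the least significant (0x01). The function names are invented.

```python
# Messages in bit order; bit 0 is the leftmost (most significant) bit.
VAL_ERRORS = [
    "missing file argument",              # bit 0, mask 0x80
    "unknown or duplicate option",        # bit 1, mask 0x40
    "corrupted SCCS file",                # bit 2, mask 0x20
    "can't open file or file not SCCS",   # bit 3, mask 0x10
    "SID is invalid or ambiguous",        # bit 4, mask 0x08
    "SID does not exist",                 # bit 5, mask 0x04
    "%Y%, -y mismatch",                   # bit 6, mask 0x02
    "%M%, -m mismatch",                   # bit 7, mask 0x01
]


def decode_val_code(code):
    """List the errors recorded in an 8-bit val exit code."""
    return [msg for bit, msg in enumerate(VAL_ERRORS) if code & (0x80 >> bit)]


def aggregate(codes):
    """Combine per-file codes with a logical OR, as val does for several files."""
    total = 0
    for code in codes:
        total |= code
    return total


print(decode_val_code(aggregate([0x10, 0x04])))
```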

Limitations of the val Command

val can process up to 50 files on a single command line. Any number above 50
produces a memory dump.

what — Identify SCCS Files

what finds SCCS identifying information within any specified file. what does
not use any options, nor does it treat directory names and a name of - (a lone
minus sign) in any special way, as do other SCCS commands.

what searches the given file(s) for all occurrences of the string @(#), which is
the replacement for the %Z% ID keyword (see get). what then displays whatever
follows that string until the first double quote ("), greater than (>),
backslash (\), newline, or (non-printing) NUL character.
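The search that what performs can be sketched in Python as follows. This is an illustration of the rule just described, not the source of the what command; the function name is invented, and the input is handled as bytes so that object files can be scanned as well as text.

```python
MARKER = b"@(#)"
# Double quote, greater than, backslash, newline, and NUL end the string.
TERMINATORS = b'">\\\n\0'


def what_strings(data):
    """Return the identification strings found in a bytes object."""
    found = []
    start = data.find(MARKER)
    while start != -1:
        begin = start + len(MARKER)
        end = begin
        while end < len(data) and data[end] not in TERMINATORS:
            end += 1
        found.append(data[begin:end].decode("ascii", "replace"))
        start = data.find(MARKER, end)   # continue after this string
    return found


print(what_strings(b'char id[] = "@(#)prog.c:3.4";\n'))
```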

As an example, let’s begin with the SCCS file s.prog.c (a C program), which
contains the following line:

char id[] = "%Z%%M%:%I%";

We then do the following get: 

hermes% get -r3.4 s.prog.c 

and finally compile the resulting g-file to produce prog.o and a.out. Using
what as follows then displays:

The string what searches for need not be inserted via an ID keyword of get;
it may be inserted in any convenient manner.

A.17. SCCS Files

This section discusses several topics that must be considered before extensive use 
is made of SCCS. These topics deal with the protection mechanisms relied upon 
by SCCS, the format of SCCS files, and the recommended procedures for auditing 
SCCS files. 


Protection 


SCCS relies on the capabilities of the operating system for most of the protection
mechanisms required to prevent unauthorized changes to SCCS files (that is,
changes made by non-SCCS commands). The only protection features provided
directly by SCCS are the release lock flag, the release floor and ceiling flags, and
the user list.

New SCCS files created by admin are given mode 444 (read-only). It is best not
to change this mode, as it prevents any direct modification of the files by
non-SCCS commands.

SCCS files should be kept in directories that contain only SCCS files and any
temporary files created by SCCS commands. This simplifies protection and auditing
of SCCS files. The contents of directories should correspond to convenient
logical groupings, for example, subsystems of a large project.

SCCS files must have only one link (name). Commands that modify SCCS files
do so by creating a temporary copy of the file (called the x-file), and, upon
completion of processing, remove the old file and rename the x-file. If the old file
has more than one link, removing it and renaming the x-file would break the link.
Rather than process such files, SCCS commands produce an error message. All
SCCS files must have names that begin with ‘s.’.
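The reason for the single-link rule is easier to see in code. The Python sketch below imitates the update procedure; it is not how the SCCS commands are written. The function names are invented, and the x-file name (the s. prefix replaced by x.) is an assumption of the example.

```python
import os


def check_sccs_file(path):
    """Return the problems that would stop an SCCS-style update of path."""
    problems = []
    if not os.path.basename(path).startswith("s."):
        problems.append("name does not begin with 's.'")
    if os.stat(path).st_nlink != 1:
        problems.append("file has more than one link")
    return problems


def replace_via_xfile(path, new_text):
    """Rewrite a file by way of a temporary copy that is then renamed."""
    problems = check_sccs_file(path)
    if problems:
        raise ValueError("; ".join(problems))
    directory, name = os.path.split(path)
    xfile = os.path.join(directory, "x." + name[2:])   # assumed x-file name
    with open(xfile, "w") as out:
        out.write(new_text)
    os.chmod(xfile, 0o444)    # read-only, like files created by admin
    os.remove(path)           # remove the old file ...
    os.rename(xfile, path)    # ... and give the x-file its name
```

Because the old file is removed rather than rewritten in place, any second name linked to it would be left pointing at the old contents, which is why such files are rejected.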

When only one user uses SCCS, the real and effective user IDs are the same, and 
that user ID owns the directories containing SCCS files. Therefore, SCCS may be 
used directly without any preliminary preparation. 

However, in those situations in which several users with different user IDs are
assigned responsibility for one SCCS file (for example, in large software
development projects), one user (equivalently, one user ID) must be chosen as the
‘owner’ of the SCCS files and as the one who will ‘administer’ them (for
example, by using admin).

This user is termed the SCCS administrator for that project. Because other users
of SCCS do not have the same privileges and permissions as the SCCS
administrator, they are not able to execute directly those commands that require write
permission in the directory containing the SCCS files. Therefore, a
project-dependent program is required to provide an interface to the get, delta, and,
if desired, rmdel and cdc commands.

The interface program must be owned by the SCCS administrator, and must have
the set-user-ID on execution bit on (see chmod(1)), so that the effective user ID
is the administrator’s user ID. This program’s function is to invoke the desired
SCCS command and to cause it to inherit the privileges of the interface program
for the duration of that command’s execution. In this manner, the owner of an
SCCS file can modify it at will. Other users whose login names or group IDs are
in the user list for that file (but who are not its owners) are given the necessary
permissions only for the duration of the execution of the interface program, and
are thus able to modify the SCCS files only through the use of delta and,
possibly, rmdel and cdc.

The project-dependent interface program, as its name implies, must be
custom-built for each project.


Layout of an SCCS File 


SCCS files are composed of lines of ASCII text arranged in six parts, as follows: 


Checksum          A line containing the ‘logical’ sum of all the characters of
                  the file (not including this checksum itself).

Delta Table       Information about each delta, such as its type, SID, date and
                  time of creation, and commentary included.

User Names        List of login names and/or group IDs of users who are
                  allowed to modify the file by adding or removing deltas.

Flags             Indicators that control certain actions of various SCCS
                  commands.

Descriptive Text  Text provided by the user; usually a summary of the
                  contents and purpose of the file.

Body              Actual text that is being administered by SCCS, intermixed
                  with internal SCCS control lines.


Detailed information about the contents of the various sections of the file may be
found in sccsfile(5). In the following, the checksum is the only portion of
the file discussed.


Because SCCS files are ASCII files, they may be processed by various commands:
editors such as vi(1), text processing programs such as grep(1), awk(1), and
cat(1), and so on. This is quite useful when an SCCS file must be modified
manually (for example, when the time and date of a delta was recorded
incorrectly because the system clock was set incorrectly), or when one simply
wants to ‘look’ at the file.

CAUTION Extreme care should be exercised when modifying SCCS files with non-SCCS 
commands. 


Auditing

On rare occasions, perhaps due to an operating system or hardware malfunction,
all or part of an SCCS file is destroyed. SCCS commands (like most commands)
display an error message when a file does not exist. In addition, SCCS commands
use the checksum stored in the SCCS file to determine whether a file has been
corrupted since it was last accessed (has lost data, or has been changed). The
only SCCS command which will process a corrupted SCCS file is admin with
the -h or -z options. This is discussed below.

SCCS files should be audited (checked) for possible corruptions on a regular
basis. The simplest and fastest way to audit such files is to use admin with the
-h option on them:

hermes% admin -h s.file1 s.file2 ...

hermes% admin -h directory1 directory2 ...

If the new checksum of any file is not equal to the checksum in the first line of 
that file, the message 

corrupted file (co6) 

is produced for that file. This process continues until all files have been
examined. When examining directories (as in the second example above), the process
just described does not detect missing files. A simple way to detect whether any
files are missing from a directory is to periodically list the contents of the
directory (using ls(1)), and compare the current listing with the previous one. Any
file which appears on the previous list but not the current one has been removed
by some means.
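The listing comparison can be automated. The Python sketch below is one way to do it and is not an SCCS facility: the function names are invented, and where the previous listing is kept between audits is left to the administrator.

```python
import os


def snapshot(directory):
    """Return the sorted names of the SCCS files (s. prefix) in a directory."""
    return sorted(n for n in os.listdir(directory) if n.startswith("s."))


def missing_files(previous, current):
    """Names that appear in the previous listing but not in the current one."""
    return sorted(set(previous) - set(current))


# Listings would normally come from snapshot(); these are sample values.
print(missing_files(["s.a.c", "s.b.c", "s.c.c"], ["s.a.c", "s.c.c"]))
```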

When a file has been corrupted, the appropriate method of restoration depends
upon the extent of the corruption. If damage is extensive, the best solution is to
restore the file from a backup copy. When damage is minor, repairing the file
with your favorite text editor may be possible. If you do repair the file with the
system’s text processing capabilities, you must use admin with the -z option
to recompute the checksum to bring it into agreement with the actual contents of
the file:


hermes% admin -z s.file


After this command is executed on a file, any corruption which may have existed 
in that file will no longer be detectable. 
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B.l. New Features. 

Default Makefile
The State File .make.state
Hidden Dependency Checking
Command Dependency Checking
Automatic SCCS Extraction
Tilde Rules Superseded
SCCS History Files
Pattern Matching Rules: More Convenient than Suffix Rules
Pattern Replacement Macro References
New Options
Support for Modula-2
Naming Scheme for Predefined Macros
New Special-Purpose Targets
New Implicit Rule for lint
Macro Processing Changes
Macros: Definition, Substitution, and Suffix Replacement
Improved ar Library Support
Lists of Members
Handling of ar’s Name Length Limitation
Target Groups
Clearing Definitions of Special Targets and Implicit Rules

B.2. Incompatibilities with Previous Versions of make

New Meaning for -d Option
Dynamic Macros
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B.l. New Features 


Default Makefile 


make’s implicit rules and macro definitions are no longer hard-coded within the
program itself. They are now contained in the default makefile
/usr/include/make/default.mk. make reads this file automatically,
unless there is a file in the local directory named default.mk. When you use
a local default.mk file, you must add an include
/usr/include/make/default.mk directive to get the standard implicit
rules and predefined macros.


The State File .make.state

make also reads from a state file, .make.state, in the current directory. When the
special-function target .KEEP_STATE is used in the makefile, make writes out
a cumulative report for each target containing a list of hidden dependencies (as
reported by compilation processors such as cpp), and the most recent rule used
to build each target. The state file is very similar in format to an ordinary
makefile.


Hidden Dependency Checking

When activated by the presence of the .KEEP_STATE target, make uses
information reported from cpp, f77, make, pc and other compilation commands,
and performs a dependency check against any header files (or in some cases,
libraries) that are incorporated into the target file. These "hidden" dependency
files do not appear in the dependency list, and often do not reside in the local
directory.

Command Dependency Checking

When .KEEP_STATE is in effect, if any command line used to build a target
changes between make runs (either as a result of editing the makefile or because
of a different macro expansion), the target is treated as if it were out of date;
make rebuilds it (even if it is newer than the files it depends on).

Automatic SCCS Extraction

Tilde Rules Superseded

This version of make automatically runs sccs get, as appropriate, when there
is no rule to build a target file. A tilde appended to a suffix in the suffixes list
indicates that SCCS extraction is appropriate for files having that suffix. There
are no longer special versions of implicit rules that include commands to extract
current versions of SCCS files.

To inhibit or alter the procedure for automatic extraction of the current SCCS
version, redefine the .SCCS_GET special-function target. An empty rule for this
target inhibits automatic extraction entirely.

SCCS History Files

make no longer searches the current directory for SCCS history (s.) files.
These files must now reside in an SCCS subdirectory.


Pattern Matching Rules: 
More Convenient than Suffix 
Rules 


Pattern matching rules have been added to simplify the process of adding new
implicit rules of your own design. A target entry of the form:

tp%ts: dp%ds
	rule

defines a pattern matching rule for building a target from a related dependency
file. tp is the target’s prefix; ts, its suffix. dp is the dependency’s prefix; ds, its
suffix. The % symbol is a wild card that matches a contiguous string of zero or
more characters appearing in both the target and the dependency filename. For
example, the following target entry defines a pattern matching rule for building a
troff output file, ending in .tr, from a file that uses the -ms macro package,
ending in .ms:

%.tr: %.ms
	troff -t -ms $< > $@

With this entry in the makefile, the command:

make doc.tr

produces:

hermes% make doc.tr
troff -t -ms doc.ms > doc.tr

Using that same entry, if there is a file named doc2.ms, the command:

make doc2.tr

produces:

hermes% make doc2.tr
troff -t -ms doc2.ms > doc2.tr
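How the % wild card ties the target name to the dependency name can be sketched in Python. This is an illustration of the matching idea only, not make's algorithm; the function names are invented, and each pattern is assumed to contain exactly one % character.

```python
def match_pattern(pattern, name):
    """Return the string matched by % if name fits pattern, else None."""
    prefix, _, suffix = pattern.partition("%")
    if (len(name) >= len(prefix) + len(suffix)
            and name.startswith(prefix) and name.endswith(suffix)):
        return name[len(prefix):len(name) - len(suffix)]
    return None


def dependency_for(target_pattern, dep_pattern, target):
    """Derive the dependency name that a pattern rule implies for a target."""
    stem = match_pattern(target_pattern, target)
    if stem is None:
        return None               # the rule does not apply to this target
    return dep_pattern.replace("%", stem, 1)


print(dependency_for("%.tr", "%.ms", "doc.tr"))
print(dependency_for("%.tr", "%.ms", "doc2.tr"))
```

The same string is substituted for % on both sides of the rule, which is why doc.tr is paired with doc.ms and doc2.tr with doc2.ms.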



An explicit target entry overrides any pattern matching rule that might apply to a
target. Pattern matching rules, in turn, normally override implicit rules. An
exception to this is when the pattern matching rule has no commands in the rule
portion of its target entry. In this case, make continues the search for a rule to
build the target, using as its dependency the file that matched the (dependency)
pattern.


Pattern Replacement Macro References

As with suffix rules and pattern matching rules, pattern replacement macro
references have been added to provide a more general method for altering the
values of words in a specific macro reference than that already provided by suffix
replacement in macro references. A pattern replacement macro reference takes the
form:

$(macro:p%s=np%ns)

where p is an existing prefix (if any), s is an existing suffix (if any), np and ns are
the new prefix and suffix, respectively, and % is a wild card character matching a
string of zero or more characters within a word. The prefix and suffix
replacements are applied to all words in the macro value that match the existing pattern.
This feature is useful for prefixing the name of a subdirectory to each item in a
list of files. For instance, the following makefile:


SOURCES= x.c y.c z.c
SUBFILES.o= $(SOURCES:%.c=subdir/%.o)

all:
	echo $(SUBFILES.o)

produces:

hermes% make
echo subdir/x.o subdir/y.o subdir/z.o
subdir/x.o subdir/y.o subdir/z.o
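The word-by-word substitution performed by a pattern replacement macro reference can be imitated in Python. This is a sketch of the behavior described above, not make's implementation; the function name is invented, and each pattern is assumed to contain one % character.

```python
def pattern_replace(value, old, new):
    """Apply a replacement like $(macro:old=new) to each word of value."""
    prefix, _, suffix = old.partition("%")
    result = []
    for word in value.split():
        if (len(word) >= len(prefix) + len(suffix)
                and word.startswith(prefix) and word.endswith(suffix)):
            stem = word[len(prefix):len(word) - len(suffix)]
            word = new.replace("%", stem, 1)
        result.append(word)       # words that do not match pass through
    return " ".join(result)


print(pattern_replace("x.c y.c z.c", "%.c", "subdir/%.o"))
```

Words that do not match the existing pattern are left unchanged, so a mixed list such as x.c README keeps README as it is.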


Please note that pattern replacement macro references should not appear on the
dependency line of a pattern matching rule’s target entry. This produces
unexpected results. With the makefile:

OBJECT= .o

x:
%: %.$(OBJECT:%o=%Z)
	cp $< $@

it looks as if make should attempt to build a target named x from a file named
x.Z. However, the pattern matching rule is not recognized; make cannot
determine which of the % characters in the dependency line apply to the pattern
matching rule, and which apply to the macro reference. Consequently, the target
entry for x.Z is never reached. To avoid problems like this, you can use an
intermediate macro on another line:

OBJECT= .o
ZMAC= $(OBJECT:%o=%Z)

x:
%: %$(ZMAC)
	cp $< $@

New Options

There are a number of new options:

-d   Display dependency-check results for each target processed. Displays all
     dependencies that are newer, or indicates that the target was built as the
     result of a command dependency.

-dd  The same function as -d had in earlier versions of make. Displays a great
     deal of output about all details of the make run, including internal states,
     etc.

-D   Display the text of the makefile.

-DD  Display the text of the makefile, and of the default makefile being used.

-p   Print macro definitions and target entries.

-P   Report on dependency checks without rebuilding targets.

Support for Modula-2

This version of make contains predefined macros and implicit rules for
compiling Modula-2 sources.

Naming Scheme for Predefined Macros

The naming scheme for predefined macros has been rationalized, and the implicit
rules have been rewritten to reflect the new scheme. The macros and implicit
rules are upward compatible with existing makefiles.

For example, there is now a macro called SUFFIXES that contains the default
entries for the suffixes list; the target entry for the default suffixes list looks like:

.SUFFIXES: $(SUFFIXES)

If you want to insert new suffixes at the head of the list, you can do so quite
simply as follows:

.SUFFIXES:
.SUFFIXES: .ms .tr $(SUFFIXES)

Other examples include the macros for standard compilation commands:

LINK.c      Standard cc command line for producing executable files.

COMPILE.c   Standard cc command line for producing object files.

New Special-Purpose Targets

.KEEP_STATE When included in a makefile, this target enables hidden
            dependency and command dependency checking. In addition, make
            updates the state file .make.state after each run.

            Note: The .KEEP_STATE target should not be removed once it has
            been used in a make run.

.INIT and .DONE
            These targets can be used to supply commands to perform at
            the beginning and end, respectively, of each make run.

.SCCS_GET   This target contains the rule for extracting current versions
            from SCCS history files.

New Implicit Rule for lint

Implicit rules have been added to support incremental verification with lint.

Macro Processing Changes

A macro’s value can now be of virtually any length.

Macros: Definition, Substitution, and Suffix Replacement

New Append Operator: +=
            This operator appends a SPACE, followed by a word or
            words, onto the existing value of the macro.

Conditional Macro Definitions: :=
            This operator indicates a conditional (targetwise) macro
            definition. A makefile entry of the form:

            target := macro = value

            indicates that macro takes the indicated value while
            processing target and its dependencies.

Suffix Replacement Precedence
            Substring replacement now takes place following expansion of
            the macro being referred to. Previous versions of make
            applied the substitution first, with results that were
            counterintuitive.

Nested Macro References
            make now expands inner references before parsing the outer
            reference. So, a nested reference as in this example:

            CFLAGS-g = -I../include
            OPTION = -g
            $(CFLAGS$(OPTION))

            now yields the value -I../include, rather than a null
            value, as it would have in previous versions.

Cross-Compilation Macros
            The predefined macros HOST_ARCH, HOST_MACH,
            TARGET_ARCH, and TARGET_MACH are available for use in
            cross-compilations. By default, the arch macros are set to the
            value returned by the arch command; the mach macros are
            set to the value returned by mach.
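The inner-first expansion of nested macro references described above can be sketched in Python. This is an illustration, not make's macro processor: the function name is invented, only the $(NAME) form is handled, undefined macros expand to a null value, and macro values are assumed to contain no parentheses and no references to themselves.

```python
def expand(text, macros):
    """Expand $(NAME) references in text, innermost reference first."""
    pos = 0
    while True:
        close = text.find(")", pos)
        if close == -1:
            return text
        start = text.rfind("$(", 0, close)
        if start == -1:
            pos = close + 1       # a stray ')', not part of a reference
            continue
        name = text[start + 2:close]
        text = text[:start] + macros.get(name, "") + text[close + 1:]
        pos = start               # rescan: the value may complete an outer name


macros = {"CFLAGS-g": "-I../include", "OPTION": "-g"}
print(expand("$(CFLAGS$(OPTION))", macros))
```

The inner reference $(OPTION) is replaced by -g first, so the outer reference becomes $(CFLAGS-g) before it is looked up.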
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Improved ar Library Support 
Lists of Members

make automatically updates an ar library member from a file having the same
name as the member. Also, make now supports lists of members as dependency
names of the form:

lib.a: lib.a(member member ...)

Handling of ar’s Name Length Limitation

make now copes with the 15-character member-name length limitation in ar. It
now recognizes a member name that matches the first 15 characters of a filename
as the member corresponding to the file.

Target Groups

It is now possible to specify that a rule produces a set of target files. A + sign
between target names in the target entry indicates that the named targets
comprise a group. The target group’s rule is performed once, at most, in a make
invocation.

Clearing Definitions of Special Targets and Implicit Rules

To clear the dependency list and rule for a special target, implicit rule, or any
target with a name beginning with ‘.’, add a target entry to the makefile with no
dependency list and no rule. For example, to clear a previous .DEFAULT rule,
add the line:

.DEFAULT:

to your makefile.

B.2. Incompatibilities with Previous Versions of make

New Meaning for -d Option

The -d option now reports the reason why a target is considered out of date.

Dynamic Macros

Although the dynamic macros < and * were documented as being assigned only for
implicit rules and the .DEFAULT target, in some cases they actually were
assigned for explicit target entries. The assignment action is now documented
properly.

The actual value assigned to each of these macros is derived by the same
procedure used within implicit rules (this hasn’t changed). This can lead to
unexpected results when they are used in explicit target entries.

Even if you supply explicit dependencies, make doesn’t use them to derive
values for these macros. Instead, it searches for an appropriate implicit rule and
dependency file. For instance, if you have the explicit target entry:

test: test.f
	echo $<

and the files test.c and test.f, you might expect that $< would be
assigned the value test.f. This is not the case. It is assigned test.c,
because .c is ahead of .f in the suffixes list:

hermes% make test
echo test.c

For explicit entries, we recommend a strictly deterministic method for deriving a
dependency name using macro references and suffix replacements. For example,
you could use $@.f instead of $< to derive the dependency name. To derive
the basename of a .o target file, you could use the suffix replacement macro
reference $(@:.o=) instead of $*.
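The difference between the two ways of deriving a dependency name can be sketched in Python. This is a simplified model, not make's search procedure: the function names are invented, and the suffixes list used in the example is hypothetical apart from .c preceding .f.

```python
def implicit_dependency(target, suffixes, existing_files):
    """Model of the implicit search: the first target+suffix that exists wins."""
    for suffix in suffixes:
        candidate = target + suffix
        if candidate in existing_files:
            return candidate
    return None


def explicit_dependency(target, suffix):
    """The deterministic alternative, like writing $@.f in the rule."""
    return target + suffix


def strip_suffix(name, suffix):
    """Like $(@:.o=): remove the suffix from a name that ends with it."""
    if suffix and name.endswith(suffix):
        return name[:len(name) - len(suffix)]
    return name


files = {"test.c", "test.f"}
print(implicit_dependency("test", [".o", ".c", ".f"], files))
print(explicit_dependency("test", ".f"))
```

The first result depends on which files happen to exist and on the order of the suffixes list; the second depends only on the target name.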

When hidden dependency checking is in effect, the $? dynamic macro’s value
includes the names of hidden dependencies, such as header files. This can lead to
failed compilations when using a target entry such as:

x: x.c
	$(CC) -o $@ $?

and the file x.c #include’s header files. The workaround is to replace $?
with $<.
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15 
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process creation, execl () and exeev (), in SunOS pro¬ 
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nested make commands, described, 165 
new
    features in make, 399, 405
    special targets for make, 403
NEWLINE, 283
newwin(), 286
nl(), 286
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